Preface


This book is intended for students of the physical sciences, especially physics, who 
have already studied some mechanics as part of an introductory physics course (“freshman
physics” at a typical American university) and are now ready for a deeper look at
the subject. The book grew out of the junior-level mechanics course which is offered 
by the Physics Department at Colorado and is taken mainly by physics majors, but 
also by some mathematicians, chemists, and engineers. Almost all of these students 
have taken a year of freshman physics, and so have at least a nodding acquaintance 
with Newton’s laws, energy and momentum, simple harmonic motion, and so on. In 
this book I build on this nodding acquaintance to give a deeper understanding of these 
basic ideas, and then go on to develop more advanced topics, such as the Lagrangian 
and Hamiltonian formulations, the mechanics of noninertial frames, motion of rigid 
bodies, coupled oscillators, chaos theory, and a few more. 

Mechanics is, of course, the study of how things move — how an electron moves 
down your TV tube, how a baseball flies through the air, how a comet moves round the 
sun. Classical mechanics is the form of mechanics developed by Galileo and Newton in 
the seventeenth century and reformulated by Lagrange and Hamilton in the eighteenth 
and nineteenth centuries. For more than two hundred years, it seemed that classical 
mechanics was the only form of mechanics, that it could explain the motion of all 
conceivable systems. 

Then, in two great revolutions of the early twentieth century, it was shown that 
classical mechanics cannot account for the motion of objects traveling close to the 
speed of light, nor of subatomic particles moving inside atoms. The years from about 
1900 to 1930 saw the development of relativistic mechanics primarily to describe fast- 
moving bodies and of quantum mechanics primarily to describe subatomic systems. 

Faced with this competition, one might expect classical mechanics to have lost much 
of its interest and importance. In fact, however, classical mechanics is now, at the start 
of the twenty-first century, just as important and glamorous as ever. This resilience is 
due to three facts: First, there are just as many interesting physical systems as ever 
that are best described in classical terms. To understand the orbits of space vehicles 
and of charged particles in modern accelerators, you have to understand classical
mechanics. Second, recent developments in classical mechanics, mainly associated
with the growth of chaos theory, have spawned whole new branches of physics and 
mathematics and have changed our understanding of the notion of causality. It is these 
new ideas that have attracted some of the best minds in physics back to the study of 
classical mechanics. Third, it is as true today as ever that a good understanding of 
classical mechanics is a prerequisite for the study of relativity and quantum mechanics. 

Physicists tend to use the term “classical mechanics” rather loosely. Many use it 
for the mechanics of Newton, Lagrange, and Hamilton; for these people, “classical 
mechanics” excludes relativity and quantum mechanics. On the other hand, in some 
areas of physics, there is a tendency to include relativity as a part of “classical
mechanics”; for people of this persuasion, “classical mechanics” means “non-quantum
mechanics.” Perhaps as a reflection of this second usage, some courses called “classical
mechanics” include an introduction to relativity, and for the same reason, I have
included one chapter on relativistic mechanics, which you can use or not, as you 
please. 

An attractive feature of a course in classical mechanics is that it is a wonderful 
opportunity to learn to use many of the mathematical techniques needed in so many 
other branches of physics — vectors, vector calculus, differential equations, complex 
numbers, Taylor series, Fourier series, calculus of variations, and matrices. I have 
tried to give at least a minimal review or introduction for each of these topics (with 
references to further reading) and to teach their use in the usually quite simple context 
of classical mechanics. I hope you will come away from this book with an increased 
confidence that you can really use these important tools. 

Inevitably, there is more material in the book than could possibly be covered in a 
one-semester course. I have tried to ease the pain of choosing what to omit. The book 
is divided into two parts: Part I contains eleven chapters of “essential” material that 
should be read pretty much in sequence, while Part II contains five “further topics” 
that are mutually independent and any of which can be read without reference to the 
others. This division is naturally not very clear cut, and how you use it depends on your 
preparation (or that of your students). In our one-semester course at the University of 
Colorado, I found I needed to work steadily through most of Part I, and I only covered 
Part II by having students choose one of its chapters to study as a term project. (An 
activity they seemed to enjoy.) Some of the professors who taught from a preliminary 
version of the book found their students sufficiently well prepared that they could 
relegate the first four or five chapters to a quick review, leaving more time to cover 
some of Part II. At schools where the mechanics course lasts two quarters, it proved 
possible to cover all of Part I and much of Part II as well. 

Because the chapters of Part II are mutually independent, it is possible to cover 
some of them before you finish Part I. For example, Chapter 12 on chaos could be 
covered immediately after Chapter 5 on oscillations, and Chapter 13 on Hamiltonian 
mechanics could be read immediately after Chapter 7 on Lagrangian mechanics. A 
number of sections are marked with an asterisk to indicate that they can be omitted 
without loss of continuity. (This is not to say that this material is unimportant. I 
certainly hope you’ll come back and read it later!) 

As always in a physics text, it is crucial that you do lots of the exercises at the 
end of each chapter. I have included a large number of these to give both teacher and
student plenty of choice. Some of them are simple applications of the ideas of the
chapter and some are extensions of those ideas. I have listed the problems by section, 
so that as soon as you have read any given section you could (and probably should) try 
a few problems listed for that section. (Naturally, problems listed for a given section 
usually require knowledge of earlier sections. I promise only that you shouldn’t need 
material from later sections.) I have tried to grade the problems to indicate their level 
of difficulty, ranging from one star (★), meaning a straightforward exercise usually 
involving just one main concept, to three stars (★★★), meaning a challenging problem
that involves several concepts and will probably take considerable time and effort. This 
kind of classification is quite subjective, very approximate, and surprisingly difficult 
to make; I would welcome suggestions for any changes you think should be made. 

Several of the problems require the use of computers to plot graphs, solve differential
equations, and so on. None of these requires any specific software; some can
be done with a relatively simple system such as MathCad or even just a spreadsheet 
like Excel; some require more sophisticated systems, such as Mathematica, Maple, 
or Matlab. (Incidentally, it is my experience that the course for which this book was 
written is a wonderful opportunity for the students to learn to use one of these fabulously
useful systems.) Problems requiring the use of a computer are indicated thus:
[Computer]. I have tended to grade them as ★★★ or at least ★★ on the grounds that
it takes a lot of time to set up the necessary code. Naturally, these problems will be 
easier for students who are experienced with the necessary software. 

Each chapter ends with a summary called “Principal Definitions and Equations 
of Chapter xx.” I hope these will be useful as a check on your understanding of the
chapter as you finish reading it and as a reference later on, as you try to find that 
formula whose details you have forgotten. 

There are many people I wish to thank for their help and suggestions. At the University
of Colorado, these include Professors Larry Baggett, John Cary, Mike Dubson,
Anatoli Levshin, Scott Parker, Steve Pollock, and Mike Ritzwoller. From other institutions,
the following professors reviewed the manuscript or used a preliminary edition
in their classes: 

Meagan Aronson, U of Michigan 
Dan Bloom, Kalamazoo College 
Peter Blunden, U of Manitoba 
Andrew Cleland, UC Santa Barbara 
Gayle Cook, Cal Poly, San Luis Obispo 
Joel Fajans, UC Berkeley 
Richard Fell, Brandeis University 
Gayanath Fernando, U of Connecticut 
Jonathan Friedman, Amherst College 
David Goldhaber-Gordon, Stanford 
Thomas Griffy, U of Texas 
Elisabeth Gwinn, UC Santa Barbara 
Richard Hilt, Colorado College 
George Horton, Rutgers 
Lynn Knutson, U of Wisconsin
Jonathan Maps, U of Minnesota, Duluth 
John Markert, U of Texas 
Michael Moloney, Rose-Hulman Institute 
Colin Morningstar, Carnegie Mellon
Declan Mulhall, Cal Poly, San Luis Obispo 
Carl Mungan, US Naval Academy 
Robert Pompi, SUNY Binghamton 
Mark Semon, Bates College 
James Shepard, U of Colorado 
Richard Sonnenfeld, New Mexico Tech 
Edward Stern, U of Washington
Michael Weinert, U of Wisconsin, Milwaukee 
Alma Zook, Pomona College 

I am most grateful to all of these and their students for their many helpful comments. 
I would particularly like to thank Carl Mungan for his amazing vigilance in catching 
typos, obscurities, and ambiguities, and Jonathan Friedman and his student, Ben
Heidenreich, who saved me from a really embarrassing mistake in Chapter 10. I am
especially grateful to my two friends and colleagues, Mark Semon at Bates College 
and Dave Goodmanson at the Boeing Aircraft Company, both of whom reviewed the 
manuscript with the finest of combs and gave me literally hundreds of suggestions; 
likewise to Christopher Taylor of the University of Wisconsin for his patient help 
with Mathematica and the mysteries of LaTeX. Bruce Armbruster and Jane Ellis of
University Science Books are an author’s dream come true. My copy editor, Lee 
Young, is a rarity indeed, an expert in English usage and physics; he suggested many 
significant improvements. Finally and most of all, I want to thank my wife Debby. 
Being married to an author can be very trying, and she puts up with it most graciously. 
And, as an English teacher with the highest possible standards, she has taught me most 
of what I know about writing and editing. I am eternally grateful. 

For all our efforts, there will surely be several errors in this book, and I would 
be most grateful if you could let me know of any that you find. Ancillary material, 
including an instructors’ manual, and other notices will be posted at the University 
Science Books website, www.uscibooks.com. 


John R. Taylor 
Department of Physics 
University of Colorado 
Boulder, Colorado 80309, USA 
John.Taylor@Colorado.edu

Part I of this book contains material that almost everyone would consider essential
knowledge for an undergraduate physics major. Part II contains optional further topics 
from which you can pick according to your tastes and available time. The distinction 
between “essential” and “optional” is, of course, arguable, and its impact on you, the 
reader, depends very much on your state of preparation. For example, if you are well 
prepared, you might decide that the first five chapters of Part I can be treated as a quick 
review, or even skipped entirely. As a practical matter, the distinction is this: The eleven 
chapters of Part I were designed to be read in sequence, and in writing each chapter, I
assumed that you would be familiar with most of the ideas of the preceding chapters — 
either by reading them or because you had met them elsewhere. By contrast, I tried to 
make the chapters of Part II independent of one another, so that you could read any 
of them in any order, once you knew most of the material of Part I. 




CHAPTER 1


Newton’s Laws of Motion 


1.1 Classical Mechanics 


Mechanics is the study of how things move: how planets move around the sun, how a 
skier moves down the slope, or how an electron moves around the nucleus of an atom. 

So far as we know, the Greeks were the first to think seriously about mechanics, more 
than two thousand years ago, and the Greeks’ mechanics represents a tremendous step 
in the evolution of modern science. Nevertheless, the Greek ideas were, by modern
standards, seriously flawed and need not concern us here. The development of the
mechanics that we know today began with the work of Galileo (1564-1642) and
Newton (1642-1727), and it is the formulation of Newton, with his three laws of 
motion, that will be our starting point in this book. 

In the late eighteenth and early nineteenth centuries, two alternative formulations 
of mechanics were developed, named for their inventors, the French mathematician 
and astronomer Lagrange (1736-1813) and the Irish mathematician Hamilton (1805- 
1865). The Lagrangian and Hamiltonian formulations of mechanics are completely 
equivalent to that of Newton, but they provide dramatically simpler solutions to 
many complicated problems and are also the taking-off point for various modern
developments. The term classical mechanics is somewhat vague, but it is generally 
understood to mean these three equivalent formulations of mechanics, and it is in this 
sense that the subject of this book is called classical mechanics. 

Until the beginning of the twentieth century, it seemed that classical mechanics 
was the only kind of mechanics, correctly describing all possible kinds of motion. 

Then, in the twenty years from 1905 to 1925, it became clear that classical mechanics 
did not correctly describe the motion of objects moving at speeds close to the speed of 
light, nor that of the microscopic particles inside atoms and molecules. The result was 
the development of two completely new forms of mechanics: relativistic mechanics 
to describe very high-speed motions and quantum mechanics to describe the motion 
of microscopic particles. I have included an introduction to relativity in the “optional”
Chapter 15. Quantum mechanics requires a whole separate book (or several books),
and I have made no attempt to give even a brief introduction to quantum mechanics.

Although classical mechanics has been replaced by relativistic mechanics and by
quantum mechanics in their respective domains, there is still a vast range of interesting 
and topical problems in which classical mechanics gives a complete and accurate 
description of the possible motions. In fact, particularly with the advent of chaos 
theory in the last few decades, research in classical mechanics has intensified and the 
subject has become one of the most fashionable areas in physics. The purpose of this 
book is to give a thorough grounding in the exciting field of classical mechanics. When 
appropriate, I shall discuss problems in the framework of the Newtonian formulation, 
but I shall also try to emphasize those situations where the newer formulations of 
Lagrange and Hamilton are preferable and to use them when this is the case. At 
the level of this book, the Lagrangian approach has many significant advantages 
over the Newtonian, and we shall be using the Lagrangian formulation repeatedly, 
starting in Chapter 7. By contrast, the advantages of the Hamiltonian formulation 
show themselves only at a more advanced level, and I shall postpone the introduction 
of Hamiltonian mechanics to Chapter 13 (though it can be read at any point after 
Chapter 7). 

In writing the book, I took for granted that you have had an introduction to 
Newtonian mechanics of the sort included in a typical freshman course in “General 
Physics.” This chapter contains a brief review of the ideas that I assume you have met 
before. 


1.2 Space and Time 


Newton’s three laws of motion are formulated in terms of four crucial underlying 
concepts: the notions of space, time, mass, and force. This section reviews the first 
two of these, space and time. In addition to a brief description of the classical view 
of space and time, I give a quick review of the machinery of vectors, with which we 
label the points of space. 


Space 

Each point P of the three-dimensional space in which we live can be labeled by a 
position vector r which specifies the distance and direction of P from a chosen origin 
O as in Figure 1.1. There are many different ways to identify a vector, of which one 
of the most natural is to give its components (x, y, z) in the directions of three chosen
perpendicular axes. One popular way to express this is to introduce three unit vectors,
$\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}$, pointing along the three axes and to write

$$\mathbf{r} = x\,\hat{\mathbf{x}} + y\,\hat{\mathbf{y}} + z\,\hat{\mathbf{z}}. \tag{1.1}$$

In elementary work, it is probably wise to choose a single good notation, such as (1.1), 
and stick with it. In more advanced work, however, it is almost impossible to avoid 
using several different notations. Different authors have different preferences (another 
popular choice is to use $\mathbf{i}, \mathbf{j}, \mathbf{k}$ for what I am calling $\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}$) and you must get used
to reading them all. Furthermore, almost every notation has its drawbacks, which can
make it unusable in some circumstances. Thus, while you may certainly choose your
preferred scheme, you need to develop a tolerance for several different schemes.

Figure 1.1 The point P is identified by its position vector r,
which gives the position of P relative to a chosen origin O. The
vector r can be specified by its components (x, y, z) relative to
chosen axes Oxyz.

It is sometimes convenient to be able to abbreviate (1.1) by writing simply 

$$\mathbf{r} = (x, y, z). \tag{1.2}$$

This notation is obviously not quite consistent with (1.1), but it is usually completely 
unambiguous, asserting simply that r is the vector whose components are x, y, z. 
When the notation of (1.2) is the most convenient, I shall not hesitate to use it. For 
most vectors, we indicate the components by subscripts x, y, z. Thus the velocity
vector v has components $v_x, v_y, v_z$ and the acceleration a has components $a_x, a_y, a_z$.

As our equations become more complicated, it is sometimes inconvenient to write 
out all three terms in sums like (1.1); one would rather use the summation sign 
followed by a single term. The notation of (1.1) does not lend itself to this shorthand, 
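and for this reason I shall sometimes relabel the three components x, y, z of r as
$r_1, r_2, r_3$, and the three unit vectors $\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}$ as $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$. That is, we define

$$r_1 = x, \quad r_2 = y, \quad r_3 = z,$$

and

$$\mathbf{e}_1 = \hat{\mathbf{x}}, \quad \mathbf{e}_2 = \hat{\mathbf{y}}, \quad \mathbf{e}_3 = \hat{\mathbf{z}}.$$

(The symbol e is commonly used for unit vectors, since e stands for the German “eins”
or “one.”) With these notations, (1.1) becomes

$$\mathbf{r} = r_1\mathbf{e}_1 + r_2\mathbf{e}_2 + r_3\mathbf{e}_3 = \sum_{i=1}^{3} r_i\mathbf{e}_i. \tag{1.3}$$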

For a simple equation like this, the form (1.3) has no real advantage over (1.1), but 
with more complicated equations (1.3) is significantly more convenient, and I shall 
use this notation when appropriate. 
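
As a small illustration of why the indexed form (1.3) is convenient, the sum over i translates directly into a loop. The following Python sketch is not part of the original text: the names `E` and `from_components` are hypothetical, and vectors are represented simply as tuples of three numbers.

```python
# Unit vectors e_1, e_2, e_3 along the x, y, z axes, stored as tuples.
E = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def from_components(components):
    """Build the vector sum_i r_i e_i of Eq. (1.3) from its components (r_1, r_2, r_3)."""
    return tuple(
        sum(r_i * e_i[k] for r_i, e_i in zip(components, E))
        for k in range(3)
    )

r = from_components((2.0, -1.0, 3.0))
print(r)
```

Because the basis vectors here are the standard ones, the result simply reproduces the components that were passed in, which is the content of the abbreviation (1.2).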
Vector Operations


In our study of mechanics, we shall make repeated use of the various operations that 
can be performed with vectors. If r and s are vectors with components 

$$\mathbf{r} = (r_1, r_2, r_3) \quad \text{and} \quad \mathbf{s} = (s_1, s_2, s_3),$$

then their sum (or resultant) r + s is found by adding corresponding components, so
that

$$\mathbf{r} + \mathbf{s} = (r_1 + s_1, r_2 + s_2, r_3 + s_3). \tag{1.4}$$

(You can convince yourself that this rule is equivalent to the familiar triangle and 
parallelogram rules for vector addition.) An important example of a vector sum is the 
resultant force on an object: When two forces $\mathbf{F}_a$ and $\mathbf{F}_b$ act on an object, the effect
is the same as a single force, the resultant force, which is just the vector sum

$$\mathbf{F} = \mathbf{F}_a + \mathbf{F}_b$$

as given by the vector addition law (1.4). 

If c is a scalar (that is, an ordinary number) and r is a vector, the product cr is 
given by 

$$c\mathbf{r} = (cr_1, cr_2, cr_3). \tag{1.5}$$

This means that cr is a vector in the same direction¹ as r with magnitude equal to
c times the magnitude of r. For example, if an object of mass m (a scalar) has an 
acceleration a (a vector), Newton’s second law asserts that the resultant force F on 
the object will always equal the product ma as given by (1.5). 
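
The component rules (1.4) and (1.5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `add` and `scale` and the numerical values of the forces and the mass are hypothetical, chosen to make the arithmetic easy to follow.

```python
def add(r, s):
    """Vector sum by adding corresponding components, as in (1.4)."""
    return tuple(r_i + s_i for r_i, s_i in zip(r, s))

def scale(c, r):
    """Product of a scalar c and a vector r, as in (1.5)."""
    return tuple(c * r_i for r_i in r)

# Two hypothetical forces (in newtons) acting on the same object.
F_a = (3.0, 0.0, -1.0)
F_b = (-1.0, 2.0, 1.0)
F = add(F_a, F_b)        # resultant force

# Newton's second law F = m a, solved for the acceleration a = (1/m) F.
m = 2.0                  # hypothetical mass in kilograms
a = scale(1.0 / m, F)

print(F)
print(a)
```

Multiplying by a negative scalar, for instance `scale(-1.0, F)`, flips the sign of every component, which is the reversal of direction noted in the footnote.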

There are two important kinds of product that can be formed from any pair of 
vectors. First, the scalar product (or dot product) of two vectors r and s is given by 
either of the equivalent formulas 

$$\mathbf{r} \cdot \mathbf{s} = rs\cos\theta \tag{1.6}$$

$$\mathbf{r} \cdot \mathbf{s} = r_1 s_1 + r_2 s_2 + r_3 s_3 = \sum_{n=1}^{3} r_n s_n \tag{1.7}$$

where r and s denote the magnitudes of the vectors r and s, and θ is the angle between
them. (For a proof that these two definitions are the same, see Problem 1.7.) For
example, if a force F acts on an object that moves through a small displacement dr,
the work done by the force is the scalar product F · dr, as given by either (1.6) or (1.7).
Another important use of the scalar product is to define the magnitude of a vector:
The magnitude (or length) of any vector r is denoted by |r| or r and, by Pythagoras’s
theorem, is equal to $\sqrt{r_1^2 + r_2^2 + r_3^2}$. By (1.7) this is the same as

$$r = |\mathbf{r}| = \sqrt{\mathbf{r} \cdot \mathbf{r}}. \tag{1.8}$$

The scalar product r · r is often abbreviated as $\mathbf{r}^2$.
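
The two forms of the scalar product can be checked against each other numerically. In the sketch below (not from the original text), `dot` implements the component formula (1.7) and `magnitude` implements (1.8); the angle is then recovered by solving (1.6) for θ. The function names and the sample vectors are assumptions made for illustration.

```python
import math

def dot(r, s):
    """Scalar product from components, as in (1.7)."""
    return sum(r_n * s_n for r_n, s_n in zip(r, s))

def magnitude(r):
    """Magnitude of a vector, as in (1.8)."""
    return math.sqrt(dot(r, r))

# Two sample vectors that make a 45-degree angle with each other.
r = (1.0, 0.0, 0.0)
s = (1.0, 1.0, 0.0)

# Solve (1.6), r.s = r s cos(theta), for the angle theta.
theta = math.acos(dot(r, s) / (magnitude(r) * magnitude(s)))
print(math.degrees(theta))

# Work done by a hypothetical force F over a small displacement dr.
F = (2.0, 0.0, 0.0)
dr = (0.5, 0.5, 0.0)
work = dot(F, dr)
print(work)
```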


¹ Although this is what people usually say, one should actually be careful: If c is negative, cr is
in the opposite direction to r.

The second kind of product of two vectors r and s is the vector product (or cross
product), which is defined as the vector p = r × s with components

$$\begin{aligned} p_x &= r_y s_z - r_z s_y \\ p_y &= r_z s_x - r_x s_z \\ p_z &= r_x s_y - r_y s_x \end{aligned}$$

or, equivalently,

$$\mathbf{r} \times \mathbf{s} = \det \begin{bmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ r_x & r_y & r_z \\ s_x & s_y & s_z \end{bmatrix}, \tag{1.9}$$


where “det” stands for the determinant. Either of these definitions implies that r × s is
a vector perpendicular to both r and s, with direction given by the familiar right-hand
rule and magnitude rs sin θ (Problem 1.15). The vector product plays an important
role in the discussion of rotational motion. For example, the tendency of a force F
(acting at a point r) to cause a body to rotate about the origin is given by the torque
of F about O, defined as the vector product T = r × F.
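
The component definition of the vector product is easy to turn into code. The following Python sketch is an illustration only: the function names and the sample position and force are hypothetical. It computes a torque and confirms that the result is perpendicular to both of its factors.

```python
def cross(r, s):
    """Vector product p = r x s from the component definition."""
    r_x, r_y, r_z = r
    s_x, s_y, s_z = s
    return (r_y * s_z - r_z * s_y,
            r_z * s_x - r_x * s_z,
            r_x * s_y - r_y * s_x)

def dot(r, s):
    """Scalar product from components, as in (1.7)."""
    return sum(r_n * s_n for r_n, s_n in zip(r, s))

# Right-hand rule check: x-hat cross y-hat gives z-hat.
z_hat = cross((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))

# Torque about the origin of a hypothetical force F acting at the point r.
r = (2.0, 0.0, 0.0)
F = (0.0, 3.0, 0.0)
torque = cross(r, F)

print(z_hat)
print(torque)
print(dot(torque, r), dot(torque, F))   # both vanish: torque is perpendicular to r and F
```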


Differentiation of Vectors 


Many (maybe most) of the laws of physics involve vectors, and most of these involve 
derivatives of vectors. There are so many ways to differentiate a vector that there is 
a whole subject called vector calculus, much of which we shall be developing in the 
course of this book. For now, I shall mention just the simplest kind of vector derivative, 
the time derivative of a vector that depends on time. For example, the velocity v(t)
of a particle is the time derivative of the particle’s position r(t); that is, v = dr/dt.
Similarly the acceleration is the time derivative of the velocity, a = dv/dt.

The definition of the derivative of a vector is closely analogous to that of a scalar. 
Recall that if x(t) is a scalar function of t, then we define its derivative as 


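$$\frac{dx}{dt} = \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t}$$

where Δx = x(t + Δt) − x(t) is the change in x as the time advances from t to
t + Δt. In exactly the same way, if r(t) is any vector that depends on t, we define its
derivative as

$$\frac{d\mathbf{r}}{dt} = \lim_{\Delta t \to 0} \frac{\Delta \mathbf{r}}{\Delta t} \tag{1.10}$$

where

$$\Delta \mathbf{r} = \mathbf{r}(t + \Delta t) - \mathbf{r}(t) \tag{1.11}$$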

is the corresponding change in r(t). There are, of course, many delicate questions 
about the existence of this limit. Fortunately, none of these need concern us here: 
All of the vectors we shall encounter will be differentiable, and you can take for 
granted that the required limits exist. From the definition (1.10), one can prove that 
the derivative has all of the properties one would expect. For example, if r(t) and s(t)
are two vectors that depend on t, then the derivative of their sum is just what you
would expect:

$$\frac{d}{dt}(\mathbf{r} + \mathbf{s}) = \frac{d\mathbf{r}}{dt} + \frac{d\mathbf{s}}{dt}. \tag{1.12}$$


Similarly, if r(t) is a vector and f(t) is a scalar, then the derivative of the product
f(t)r(t) is given by the appropriate version of the product rule,

$$\frac{d}{dt}(f\mathbf{r}) = f\frac{d\mathbf{r}}{dt} + \frac{df}{dt}\mathbf{r}. \tag{1.13}$$


If you are the sort of person who enjoys proving these kinds of proposition, you might 
want to show that they follow from the definition (1.10). Fortunately, if you do not 
enjoy this kind of activity, you don’t need to worry, and you can safely take these 
results for granted. 
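
For readers who would rather see the product rule at work than prove it, a numerical check is one option. The sketch below is not from the original text. It approximates the limit in (1.10) by the difference quotient with a small but finite Δt, so the two sides of (1.13) agree only approximately. The trajectory `r`, the scalar function `f`, and the step size are all hypothetical choices.

```python
import math

def r(t):
    # Hypothetical trajectory, chosen only for illustration: a helix.
    return (math.cos(t), math.sin(t), 0.5 * t)

def f(t):
    # Hypothetical scalar function of time.
    return t ** 2

def derivative(vec_func, t, dt=1e-6):
    """Estimate d(vec)/dt by the difference quotient of (1.10) with a small, finite dt."""
    later, now = vec_func(t + dt), vec_func(t)
    return tuple((a - b) / dt for a, b in zip(later, now))

t = 1.0
v = derivative(r, t)     # close to the exact velocity (-sin t, cos t, 0.5)

# Left side of (1.13): derivative of the product f(t) r(t).
lhs = derivative(lambda u: tuple(f(u) * c for c in r(u)), t)

# Right side of (1.13): f dr/dt + (df/dt) r, using the exact df/dt = 2t.
rhs = tuple(f(t) * v_i + 2 * t * r_i for v_i, r_i in zip(v, r(t)))

print(v)
print(lhs)
print(rhs)
```

The two printed tuples for the product rule should agree to several decimal places; the small remaining difference comes from using a finite Δt rather than taking the limit.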

One more result that deserves mention concerns the components of the derivative 
of a vector. Suppose that r, with components x,y,z, is the position of a moving 
particle, and suppose that we want to know the particle’s velocity v = dr/dt. When 
we differentiate the sum 


$$\mathbf{r} = x\,\hat{\mathbf{x}} + y\,\hat{\mathbf{y}} + z\,\hat{\mathbf{z}}, \tag{1.14}$$


the rule (1.12) gives us the sum of the three separate derivatives, and, by the product 
rule (1.13), each of these contains two terms. Thus, in principle, the derivative of 
(1.14) involves six terms in all. However, the unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ do not depend on
time, so their time derivatives are zero. Therefore, three of these six terms are zero, 
and we are left with just three terms: 


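$$\frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\hat{\mathbf{x}} + \frac{dy}{dt}\hat{\mathbf{y}} + \frac{dz}{dt}\hat{\mathbf{z}}. \tag{1.15}$$

Comparing this with the standard expansion

$$\mathbf{v} = v_x\hat{\mathbf{x}} + v_y\hat{\mathbf{y}} + v_z\hat{\mathbf{z}}$$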


we see that

$$v_x = \frac{dx}{dt}, \quad v_y = \frac{dy}{dt}, \quad \text{and} \quad v_z = \frac{dz}{dt}. \tag{1.16}$$


In words, the rectangular components of v are just the derivatives of the corresponding 
components of r. This is a result that we use all the time (usually without even thinking
about it) in solving elementary mechanics problems. What makes it especially
noteworthy is this: It is true only because the unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ are constant,
so that their derivatives are absent from (1.15). We shall find that in most coordinate 
systems, such as polar coordinates, the basic unit vectors are not constant, and the 
result corresponding to (1.16) is appreciably less transparent. In problems where we 
need to work in nonrectangular coordinates, it is considerably harder to write down 
velocities and accelerations in terms of the coordinates of r, as we shall see. 
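
A quick numerical illustration of this warning, offered as a sketch rather than as part of the original discussion: for a particle moving uniformly around a circle, the derivatives of the Cartesian coordinates give the velocity components directly, as in (1.16), but the derivatives of the polar coordinates (r, φ) do not. The code assumes the standard two-dimensional polar result v = (dr/dt) r̂ + r (dφ/dt) φ̂, which is not derived in this section, and the values of `R`, `omega`, and `t` are hypothetical.

```python
import math

R, omega, t = 2.0, 3.0, 0.7   # hypothetical radius, angular velocity, and time

# Cartesian description: x = R cos(omega t), y = R sin(omega t).
# By (1.16) the velocity components are just the derivatives of x and y.
v_x = -R * omega * math.sin(omega * t)
v_y = R * omega * math.cos(omega * t)
speed_cartesian = math.hypot(v_x, v_y)

# Polar description: r = R (constant) and phi = omega t.
r_dot, phi_dot = 0.0, omega

# Naive guess: treat (r_dot, phi_dot) as if they were the velocity components.
naive_speed = math.hypot(r_dot, phi_dot)

# Assumed standard polar result: the components are (r_dot, r * phi_dot).
speed_polar = math.hypot(r_dot, R * phi_dot)

print(speed_cartesian, speed_polar, naive_speed)
```

The first two numbers agree (both equal Rω), while the naive value does not, because the polar unit vectors change direction as the particle moves.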
Time

The classical view is that time is a single universal parameter t on which all observers 
agree. That is, if all observers are equipped with accurate clocks, all properly synchronized,
then they will all agree as to the time at which any given event occurred.
We know, of course, that this view is not exactly correct: According to the theory of 
relativity, two observers in relative motion do not agree on all times. Nevertheless, 
in the domain of classical mechanics, with all speeds much, much less than the speed
of light, the differences among the measured times are entirely negligible, and I shall
adopt the classical assumption of a single universal time (except, of course, in Chapter 15
on relativity). Apart from the obvious ambiguity in the choice of the origin of
time (the time that we choose to label t = 0), all observers agree on the times of all 
events. 

Reference Frames 

Almost every problem in classical mechanics involves a choice (explicit or implicit) 
of a reference frame, that is, a choice of spatial origin and axes to label positions as in 
Figure 1.1 and a choice of temporal origin to measure times. The difference between 
two frames may be quite minor. For instance, they may differ only in their choice 
of the origin of time: what one frame labels t = 0 the other may label $t' = t_0 \neq 0$.
Or the two frames may have the same origins of space and time, but have different 
orientations of the three spatial axes. By carefully choosing your reference frame, 
taking advantage of these different possibilities, you can sometimes simplify your 
work. For example, in problems involving blocks sliding down inclines, it often helps 
to choose one axis pointing down the slope. 
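
To see what the choice of axes buys in the incline example, one can compute the components of the gravitational acceleration in two frames. The Python sketch below is an illustration under stated assumptions: the incline angle, the value of g, and the convention that the slope descends toward +x are all hypothetical choices, and the names are not from the original text.

```python
import math

g = 9.8                      # m/s^2, assumed magnitude of the gravitational acceleration
alpha = math.radians(30.0)   # hypothetical incline angle

# Frame A: x horizontal, y vertically up. Gravity points straight down.
g_A = (0.0, -g)

# Frame B: x' points down the slope, y' is perpendicular to the slope.
# Both unit vectors are written in frame-A components,
# assuming the slope descends toward +x.
x_prime = (math.cos(alpha), -math.sin(alpha))
y_prime = (math.sin(alpha), math.cos(alpha))

def dot(r, s):
    """Scalar product from components."""
    return sum(r_n * s_n for r_n, s_n in zip(r, s))

# Components of the same vector along the tilted axes.
g_B = (dot(g_A, x_prime), dot(g_A, y_prime))
print(g_B)   # components (g sin(alpha), -g cos(alpha))
```

In the tilted frame the block's motion is confined to the x' axis, so only the first component, g sin α, drives it; in the untilted frame both coordinates change as the block slides.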

A more important difference arises when two frames are in relative motion; that 
is, when one origin is moving relative to the other. In Section 1.4 we shall find that not 
all such frames are physically equivalent.² In certain special frames, called inertial
frames, the basic laws hold true in their standard, simple form. (It is because one of 
these basic laws is Newton’s first law, the law of inertia, that these frames are called 
inertial.) If a second frame is accelerating or rotating relative to an inertial frame, 
then this second frame is noninertial, and the basic laws — in particular, Newton’s 
laws — do not hold in their standard form in this second frame. We shall find that 
the distinction between inertial and noninertial frames is central to our discussion of 
classical mechanics. It plays an even more explicit role in the theory of relativity. 


1.3 Mass and Force 


The concepts of mass and force are central to the formulation of classical mechanics. 
The proper definitions of these concepts have occupied many philosophers of science 
and are the subject of learned treatises. Fortunately we don’t need to worry much about
these delicate questions here. Based on your introductory course in general physics,
you have a reasonably good idea what mass and force mean, and it is easy to describe
how these parameters are defined and measured in many realistic situations.

² This statement is correct even in the theory of relativity.

Figure 1.2 An inertial balance compares the masses $m_1$ and $m_2$
of two objects that are attached to the opposite ends of a rigid rod.
The masses are equal if and only if a force applied at the rod’s
midpoint causes them to accelerate at the same rate, so that the
rod does not rotate.


Mass 

The mass of an object characterizes the object’s inertia — its resistance to being 
accelerated: A big boulder is hard to accelerate, and its mass is large. A little stone 
is easy to accelerate, and its mass is small. To make these natural ideas quantitative 
we have to define a unit of mass and then give a prescription for measuring the mass 
of any object in terms of the chosen unit. The internationally agreed unit of mass is 
the kilogram and is defined arbitrarily to be the mass of a chunk of platinum-iridium 
stored at the International Bureau of Weights and Measures outside Paris. To measure 
the mass of any other object, we need a means of comparing masses. In principle, this 
can be done with an inertial balance as shown in Figure 1.2. The two objects to be 
compared are fastened to the opposite ends of a light, rigid rod, which is then given 
a sharp pull at its midpoint. If the masses are equal, they will accelerate equally and 
the rod will move off without rotating; if the masses are unequal, the more massive 
one will accelerate less, and the rod will rotate as it moves off. 

The beauty of the inertial balance is that it gives us a method of mass comparison 
that is based directly on the notion of mass as resistance to being accelerated. In 
practice, an inertial balance would be very awkward to use, and it is fortunate that 
there are much easier ways to compare masses, of which the easiest is to weigh the 
objects. As you certainly recall from your introductory physics course, an object’s 
mass is found to be exactly proportional to the object’s weight 3 (the gravitational force 
on the object) provided all measurements are made in the same location. Thus two 


3 This observation goes back to Galileo’s famous experiments showing that all objects are 
accelerated at the same rate by gravity. The first modern experiments were conducted by the 
Hungarian physicist Eötvös (1848-1919), who showed that weight is proportional to mass to within 

objects have the same mass if and only if they have the same weight (when weighed 
at the same place), and a simple, practical way to check whether two masses are equal 
is simply to weigh them and see if their weights are equal. 

Armed with methods for comparing masses, we can easily set up a scheme to measure 
arbitrary masses. First, we can build a large number of standard kilograms, each 
one checked against the original 1-kg mass using either the inertial or gravitational 
balance. Next, we can build multiples and fractions of the kilogram, again checking 
them with our balance. (We check a 2-kg mass on one end of the balance against 
two 1-kg masses placed together on the other end; we check two half-kg masses by 
verifying that their masses are equal and that together they balance a 1-kg mass; and 
so on.) Finally, we can measure an unknown mass by putting it on one end of the 
balance and loading known masses on the other end until they balance to any desired 
precision. 


Force 

The informal notion of force as a push or pull is a surprisingly good starting point 
for our discussion of forces. We are certainly conscious of the forces that we exert 
ourselves. When I hold up a sack of cement, I am very aware that I am exerting 
an upward force on the sack; when I push a heavy crate across a rough floor, I am 
aware of the horizontal force that I have to exert in the direction of motion. Forces 
exerted by inanimate objects are a little harder to pin down, and we must, in fact, 
understand something of Newton’s laws to identify such forces. If I let go of the sack 
of cement, it accelerates toward the ground; therefore, I conclude that there must be 
another force — the sack’s weight, the gravitational force of the earth — pulling it 
downward. As I push the crate across the floor, I observe that it does not accelerate, 
and I conclude that there must be another force — friction — pushing the crate in the 
opposite direction. One of the most important skills for the student of elementary 
mechanics is to learn to examine an object’s environment and identify all the forces 
on the object: What are the things touching the object and possibly exerting contact 
forces, such as friction or air pressure? And what are the nearby objects possibly 
exerting action-at-a-distance forces, such as the gravitational pull of the earth or the 
electrostatic force of some charged body? 

If we accept that we know how to identify forces, it remains to decide how to 
measure them. As the unit of force we naturally adopt the newton (abbreviated N) 
defined as the magnitude of any single force that accelerates a standard kilogram 
mass with an acceleration of 1 m/s². Having agreed what we mean by one newton, 
we can proceed in several ways, all of which come to the same final conclusion, 
of course. The route that is probably preferred by most philosophers of science is 
to use Newton’s second law to define the general force: A given force is 2 N if, 
by itself, it accelerates a standard kilogram with an acceleration of 2 m/s², and so 


a few parts in $10^9$. Experiments in the last few decades have narrowed this to around one part in 
$10^{12}$. 
Figure 1.3 One of many possible ways to define forces of any 
magnitude. The lower spring balance has been calibrated to read 
1 N. If the balance arm on the left is adjusted so that the lever arms 
above and below the pivot are in the ratio 1 : 2 and if the force $F_1$ is 
1 N, then the force $F_2$ required to balance the arm is 2 N. This lets 
us calibrate the upper spring balance for 2 N. By readjusting the two 
lever arms, we can, in principle, calibrate the second spring balance 
to read any force. 


on. This approach is not much like the way we usually measure forces in practice, 4 
and for our present discussion a simpler procedure is to use some spring balances. 
Using our definition of the newton, we can calibrate a first spring balance to read 
1 N. Then by matching a second spring balance against the first, using a balance 
arm as shown in Figure 1.3, we can define multiples and fractions of a newton. 
Once we have a fully calibrated spring balance we can, in principle, measure any 
unknown force, by matching it against the calibrated balance and reading off its 
value. 
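
The balance-arm step can be written out as a torque balance about the pivot. What follows is only a sketch, assuming an ideal arm (massless, frictionless pivot) with both forces applied perpendicular to it. The symbols $\ell_1$ and $\ell_2$ for the lever arms on which $F_1$ and $F_2$ act are introduced here for illustration and are not part of the figure.

```latex
F_1 \ell_1 = F_2 \ell_2
\quad\Longrightarrow\quad
F_2 = F_1\,\frac{\ell_1}{\ell_2},
\qquad\text{so with } \ell_1 : \ell_2 = 2 : 1 \text{ and } F_1 = 1\ \text{N},\quad F_2 = 2\ \text{N}.
```

Changing the ratio $\ell_1/\ell_2$ changes the force needed for balance in proportion, which is what lets the second spring balance be calibrated to other values.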

So far we have defined only the magnitude of a force. As you are certainly aware, 
forces are vectors, and we must also define their directions. This is easily done. If we 
apply a given force F (and no other forces) to any object at rest, the direction of F is 
defined as the direction of the resulting acceleration, that is, the direction in which the 
body moves off. 

Now that we know, at least in principle, what we mean by positions, times, masses, 
and forces, we can proceed to discuss the cornerstone of our subject — Newton’s three 
laws of motion. 


4 The approach also creates the confusing appearance that Newton’s second law is just a consequence 
of the definition of force. This is not really true: Whatever definition we choose for force, 
a large part of the second law is experimental. One advantage of defining forces with spring balances 
is that it separates out the definition of force from the experimental basis of the second law. 
Of course, all commonly accepted definitions give the same final result for the value of any given 
force. 
1.4 Newton’s First and Second Laws; Inertial Frames 


In this chapter, I am going to discuss Newton’s laws as they apply to a point mass. A 
point mass, or particle, is a convenient fiction, an object with mass, but no size, 
that can move through space but has no internal degrees of freedom. It can have 
“translational” kinetic energy (energy of its motion through space) but no energy of 
rotation or of internal vibrations or deformations. Naturally, the laws of motion are 
simpler for point particles than for extended bodies, and this is the main reason that we 
start with the former. Later on, I shall build up the mechanics of extended bodies from 
our mechanics of point particles by considering the extended body as a collection of 
many separate particles. 

Nevertheless, it is worth recognizing that there are many important problems where 
the objects of interest can be realistically approximated as point masses. Atomic and 
subatomic particles can often be considered to be point masses, and even macroscopic 
objects can frequently be approximated in this way. A stone thrown off the top of a 
cliff is, for almost all purposes, a point particle. Even a planet orbiting around the sun 
can usually be approximated in the same way. Thus the mechanics of point masses is 
more than just the starting point for the mechanics of extended bodies; it is a subject 
with wide application itself. 

Newton’s first two laws are well known and easily stated: 


Newton’s First Law (the Law of Inertia) 

In the absence of forces, a particle moves with constant velocity v. 

and 

Newton’s Second Law 

For any particle of mass m, the net force F on the particle is always equal to the 
mass m times the particle’s acceleration: 

F = ma. (1.17) 


In this equation F denotes the vector sum of all the forces on the particle and a is the 
particle’s acceleration, 


$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \dot{\mathbf{v}} = \frac{d^2\mathbf{r}}{dt^2} = \ddot{\mathbf{r}}.$$


Here v denotes the particle’s velocity, and I have introduced the convenient notation 
of dots to denote differentiation with respect to t, as in $\mathbf{v} = \dot{\mathbf{r}}$ and $\mathbf{a} = \dot{\mathbf{v}} = \ddot{\mathbf{r}}$. 

Both laws can be stated in various equivalent ways. For instance (the first law): In 
the absence of forces, a stationary particle remains stationary and a moving particle 
continues to move with unchanging speed in the same direction. This is, of course, 
exactly the same as saying that the velocity is always constant. Again, v is constant if 
and only if the acceleration a is zero, so an even more compact statement is this: In 
the absence of forces a particle has zero acceleration. 

The second law can be rephrased in terms of the particle’s momentum, defined as 

p = mv. (1.18) 

In classical mechanics, we take for granted that the mass m of a particle never changes, 
so that 


$$\dot{\mathbf{p}} = m\dot{\mathbf{v}} = m\mathbf{a}.$$

Thus the second law (1.17) can be rephrased to say that 


$$\mathbf{F} = \dot{\mathbf{p}}. \tag{1.19}$$


In classical mechanics, the two forms (1.17) and (1.19) of the second law are completely 
equivalent. 5 


Differential Equations 

When written in the form $m\ddot{\mathbf{r}} = \mathbf{F}$, Newton’s second law is a differential equation 
for the particle’s position r(t). That is, it is an equation for the unknown function 
r(t) that involves derivatives of the unknown function. Almost all the laws of physics 
are, or can be cast as, differential equations, and a huge proportion of a physicist’s 
time is spent solving these equations. In particular, most of the problems in this book 
involve differential equations — either Newton’s second law or its counterparts in the 
Lagrangian and Hamiltonian forms of mechanics. These vary widely in their difficulty. 
Some are so easy to solve that one scarcely notices them. For example, consider 
Newton’s second law for a particle confined to move along the x axis and subject 
to a constant force $F_0$, 

$$\ddot{x}(t) = \frac{F_0}{m}.$$

This is a second-order differential equation for x(t) as a function of t. (Second-order 
because it involves derivatives of second order, but none of higher order.) To solve it 


5 In relativity, the two forms are not equivalent, as we’ll see in Chapter 15. Which form is correct 
depends on the definitions we use for force, mass, and momentum in relativity. If we adopt the most 
popular definitions of these three quantities, then it is the form (1.19) that holds in relativity. 
one has only to integrate it twice. The first integration gives the velocity 

$$\dot{x}(t) = \int \ddot{x}(t)\,dt = v_0 + \frac{F_0}{m}\,t$$

where the constant of integration is the particle’s initial velocity, and a second 
integration gives the position 

$$x(t) = \int \dot{x}(t)\,dt = x_0 + v_0 t + \frac{F_0}{2m}\,t^2$$

where the second constant of integration is the particle’s initial position. Solving this 
differential equation was so easy that we certainly needed no knowledge of the theory 
of differential equations. On the other hand, we shall meet lots of differential equations 
that do require knowledge of this theory, and I shall present the necessary theory as 
we need it. Obviously, it will be an advantage if you have already studied some of the 
theory of differential equations, but you should have no difficulty picking it up as we 
go along. Indeed, many of us find that the best way to learn this kind of mathematical 
theory is in the context of its physical applications. 
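
The two integrations for the constant-force case can also be carried out numerically and compared with the closed-form result. The following is a minimal sketch, not part of the original argument: the function names, the step count, and the numerical values of the mass, force, and initial conditions are all hypothetical choices made for illustration.

```python
def analytic_position(t, x0, v0, force, mass):
    """Closed-form x(t) for a constant force: x0 + v0*t + (F0/(2m))*t**2."""
    return x0 + v0 * t + 0.5 * (force / mass) * t**2


def integrate_twice(t_end, x0, v0, force, mass, steps=1000):
    """Integrate x'' = F0/m in small time steps: velocity first, then position."""
    dt = t_end / steps
    a = force / mass              # constant acceleration F0/m
    x, v = x0, v0
    for _ in range(steps):
        v_new = v + a * dt        # first integration: update the velocity
        # second integration: advance x with the average velocity over the step
        # (exact, up to rounding, when the acceleration is constant)
        x += 0.5 * (v + v_new) * dt
        v = v_new
    return x, v


# Illustrative values only (SI units): m = 2 kg, F0 = 4 N, x0 = 1 m, v0 = 3 m/s.
x_num, v_num = integrate_twice(t_end=2.0, x0=1.0, v0=3.0, force=4.0, mass=2.0)
x_exact = analytic_position(2.0, x0=1.0, v0=3.0, force=4.0, mass=2.0)
print("numerical x:", x_num, " analytic x:", x_exact, " numerical v:", v_num)
```

For forces that are not constant the same stepping loop still runs, but it is then only an approximation whose accuracy depends on the step size.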


Inertial Frames 

On the face of it, Newton’s second law includes his first: If there are no forces on an 
object, then F = 0 and the second law (1.17) implies that a = 0, which is the first law. 
There is, however, an important subtlety, and the first law has an important role to 
play. Newton’s laws cannot be true in all conceivable reference frames. To see this, 
consider just the first law and imagine a reference frame (we’ll call it S) in which 
the first law is true. For example, if the frame S has its origin and axes fixed relative to 
the earth’s surface, then, to an excellent approximation, the first law (the law of inertia) 
holds with respect to the frame S: A frictionless puck placed on a smooth horizontal 
surface is subject to zero force and, in accordance with the first law, it moves with 
constant velocity. Because the law of inertia holds, we call S an inertial frame. If we 
consider a second frame S' which is moving relative to S with constant velocity and is 
not rotating, then the same puck will also be observed to move with constant velocity 
relative to S'. That is, the frame S' is also inertial. 

If, however, we consider a third frame S" that is accelerating relative to S, then, as 
viewed from S", the puck will be seen to be accelerating (in the opposite direction). 
Relative to the accelerating frame S" the law of inertia does not hold, and we say 
that S" is noninertial. I should emphasize that there is nothing mysterious about this 
result. Indeed it is a matter of experience. The frame S' could be a frame attached 
to a high-speed train traveling smoothly at constant speed along a straight track, and 
the frictionless puck, an ice cube placed on the floor of the train, as in Figure 1.4. As 
seen from the train (frame S'), the ice cube is at rest and remains at rest, in accord 
with the first law. As seen from the ground (frame S), the ice cube is moving with the 
same velocity as the train and continues to do so, again in obedience to the first law. 
But now consider conducting the same experiment on a second train (frame S") that 
is accelerating forward. As this train accelerates forward, the ice cube is left behind, 
and, relative to S", the ice cube accelerates backward, even though subject to no net 
Figure 1.4 The frame S is fixed to the ground, while S' is fixed to a 
train traveling at constant velocity v' relative to S. An ice cube placed 
on the floor of the train obeys Newton’s first law as seen from both S 
and S'. If the train to which S" is attached is accelerating forward, then, 
as seen in S", an ice cube placed on the floor will accelerate backward, 
and the first law does not hold in S". 


force. Clearly the frame S" is noninertial, and neither of the first two laws can hold in 
S". A similar conclusion would hold if the frame S" had been attached to a rotating 
merry-go-round. A frictionless puck, subject to zero net force, would not move in a 
straight line as seen in S", and Newton’s laws would not hold. 
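
The train argument can be checked with a few lines of arithmetic. This is a sketch under simple assumptions: motion along one axis, a force-free puck moving uniformly in S, and frame origins that coincide at t = 0. The numerical values and the function names are hypothetical. The acceleration of the puck in each frame is estimated by a finite difference of its position in that frame.

```python
def second_difference(x, t, h=1e-2):
    """Central finite-difference estimate of the acceleration d^2x/dt^2 at time t."""
    return (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2


# Hypothetical numbers, chosen only for illustration (SI units).
U = 2.0    # constant velocity of the force-free puck as seen in S
V = 30.0   # constant velocity of frame S' relative to S
A = 1.5    # constant acceleration of frame S'' relative to S


def x_in_S(t):
    """Puck position in the inertial frame S: uniform motion."""
    return U * t


def x_in_S_prime(t):
    """Puck position in S', whose origin is at V*t as measured in S."""
    return x_in_S(t) - V * t


def x_in_S_double_prime(t):
    """Puck position in S'', whose origin is at A*t**2/2 as measured in S."""
    return x_in_S(t) - 0.5 * A * t**2


accelerations = {
    "S": second_difference(x_in_S, t=1.0),
    "S'": second_difference(x_in_S_prime, t=1.0),
    "S''": second_difference(x_in_S_double_prime, t=1.0),
}
for frame, a in accelerations.items():
    print(f"puck acceleration in {frame}: {a:+.6f} m/s^2")
```

In S and S' the estimated acceleration vanishes (to rounding error), while in S'' it comes out as minus the acceleration of the frame, even though no force acts on the puck.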

Evidently Newton’s two laws hold only in the special, inertial (nonaccelerating 
and nonrotating) reference frames. Most philosophers of science take the view that 
the first law should be used to identify these inertial frames: a reference frame S is 
inertial if objects that are clearly subject to no forces are seen to move with constant 
velocity relative to S. 6 Having identified the inertial frames by means of Newton’s 
first law, we can then claim as an experimental fact that the second law holds in these 
same inertial frames. 7 

Since the laws of motion hold only in inertial frames, you might imagine that 
we would confine our attention exclusively to inertial frames, and, for a while, 
we shall do just that. Nevertheless, you should be aware that there are situations 
where it is necessary, or at least very convenient, to work in noninertial frames. 
The most important example of a noninertial frame is in fact the earth itself. To an 
excellent approximation, a reference frame fixed to the earth is inertial — a fortunate 
circumstance for students of physics! Nevertheless, the earth rotates on its axis once 
a day and circles around the sun once a year, and the sun orbits slowly around the 
center of the Milky Way galaxy. For all of these reasons, a reference frame fixed to 
the earth is not exactly inertial. Although these effects are very small, there are several 
phenomena — the tides and the trajectories of long-range projectiles are examples — 


6 There is some danger of going in a circle here: How do we know that the object is subject to 
no forces? We’d better not answer, “Because it’s traveling at constant velocity”! Fortunately, we can 
argue that it is possible to identify all sources of force, such as people pushing and pulling or nearby 
massive bodies exerting gravitational forces. If there are no such things around, we can reasonably 
say that the object is free of forces. 

7 As I mentioned earlier, the extent to which the second law is an experimental statement depends 
on how we choose to define force. If we define force by means of the second law, then to some extent 
(though certainly not entirely) the law becomes a matter of definition. If we define forces by means 
of spring balances, then the second law is clearly an experimentally testable proposition. 
that are most simply explained by taking into account the noninertial character of a 
frame fixed to the earth. In Chapter 9 we shall examine how the laws of motion must 
be modified for use in noninertial frames. For the moment, however, we shall confine 
our discussion to inertial frames. 


Validity of the First Two Laws 

Since the advent of relativity and quantum mechanics, we have known that Newton’s 
laws are not universally valid. Nevertheless, there is an immense range of phenomena 
(the phenomena of classical physics) where the first two laws are for all 
practical purposes exact. Even as the speeds of interest approach c, the speed of light, 
and relativity becomes important, the first law remains exactly true. (In relativity, 
just as in classical mechanics, an inertial frame is defined as one where the first law 
holds.) 8 As we shall see in Chapter 15, the two forms of the second law, $\mathbf{F} = m\mathbf{a}$ and 
$\mathbf{F} = \dot{\mathbf{p}}$, are no longer equivalent in relativity, although with F and p suitably defined 
the second law in the form $\mathbf{F} = \dot{\mathbf{p}}$ is still valid. In any case, the important point is this: 
In the classical domain, we can and shall assume that the first two laws (the second 
in either form) are universally and precisely valid. You can, if you wish, regard this 
assumption as defining a model — the classical model — of the natural world. The 
model is logically consistent and is such a good representation of many phenomena 
that it is amply worthy of our study. 


1.5 The Third Law and Conservation of Momentum 


Newton’s first two laws concern the response of a single object to applied forces. 
The third law addresses a quite different issue: Every force on an object inevitably 
involves a second object — the object that exerts the force. The nail is hit by the 
hammer, the cart is pulled by the horse, and so on. While this much is no doubt a matter 
of common sense, the third law goes considerably beyond our everyday experience. 
Newton realized that if an object 1 exerts a force on another object 2, then object 2 
always exerts a force (the “reaction” force) back on object 1. This seems quite natural: 
If you push hard against a wall, it is fairly easy to convince yourself that the wall is 
exerting a force back on you, without which you would undoubtedly fall over. The 
aspect of the third law which certainly goes beyond our normal perceptions is this: 
According to the third law, the reaction force of object 2 on object 1 is always equal and 
opposite to the original force of 1 on 2. If we introduce the notation $\mathbf{F}_{21}$ to denote the 
force exerted on object 2 by object 1, Newton’s third law can be stated very compactly: 


8 However, in relativity the relationship between different inertial frames — the so-called 
Lorentz transformation — is different from that of classical mechanics. See Section 15.6. 
Figure 1.5 Newton’s third law asserts that the reaction 
force exerted on object 1 by object 2 is equal and opposite 
to the force exerted on 2 by 1, that is, $\mathbf{F}_{12} = -\mathbf{F}_{21}$. 


Newton’s Third Law 

If object 1 exerts a force $\mathbf{F}_{21}$ on object 2, then object 2 always exerts a reaction 
force $\mathbf{F}_{12}$ on object 1 given by 

$$\mathbf{F}_{12} = -\mathbf{F}_{21}. \tag{1.20}$$


This statement is illustrated in Figure 1.5, which you could think of as showing the 
force of the earth on the moon and the reaction force of the moon on the earth (or a 
proton on an electron and the electron on the proton). Notice that this figure actually 
goes a little beyond the usual statement (1.20) of the third law: Not only have I shown 
the two forces as equal and opposite; I have also shown them acting along the line 
joining 1 and 2. Forces with this extra property are called central forces. (They act 
along the line of centers.) The third law does not actually require that the forces be 
central, but, as I shall discuss later, most of the forces we encounter (gravity, the 
electrostatic force between two charges, etc.) do have this property. 

As Newton himself was well aware, the third law is intimately related to the law 
of conservation of momentum. Let us focus, at first, on just two objects as shown 
in Figure 1.6, which might show the earth and the moon or two skaters on the ice. 
In addition to the force of each object on the other, there may be “external” forces 
exerted by other bodies. The earth and moon both experience forces exerted by the 
sun, and both skaters could experience the external force of the wind. I have shown 
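the net external forces on the two objects as $\mathbf{F}_1^{\text{ext}}$ and $\mathbf{F}_2^{\text{ext}}$. The total force on object 1 
is then 

$$(\text{net force on 1}) = \mathbf{F}_1 = \mathbf{F}_{12} + \mathbf{F}_1^{\text{ext}}$$

and similarly 

$$(\text{net force on 2}) = \mathbf{F}_2 = \mathbf{F}_{21} + \mathbf{F}_2^{\text{ext}}.$$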

We can compute the rates of change of the particles’ momenta using Newton’s second 
law: 


$$\dot{\mathbf{p}}_1 = \mathbf{F}_1 = \mathbf{F}_{12} + \mathbf{F}_1^{\text{ext}} \tag{1.21}$$

Figure 1.6 Two objects exert forces on each other and 
may also be subject to additional “external” forces from 
other objects not shown. 


and 


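$$\dot{\mathbf{p}}_2 = \mathbf{F}_2 = \mathbf{F}_{21} + \mathbf{F}_2^{\text{ext}}. \tag{1.22}$$

If we now define the total momentum of our two objects as 

$$\mathbf{P} = \mathbf{p}_1 + \mathbf{p}_2,$$

then the rate of change of the total momentum is just 

$$\dot{\mathbf{P}} = \dot{\mathbf{p}}_1 + \dot{\mathbf{p}}_2.$$

To evaluate this, we have only to add Equations (1.21) and (1.22). When we do this, 
the two internal forces, $\mathbf{F}_{12}$ and $\mathbf{F}_{21}$, cancel out because of Newton’s third law, and we 
are left with 

$$\dot{\mathbf{P}} = \mathbf{F}_1^{\text{ext}} + \mathbf{F}_2^{\text{ext}} \equiv \mathbf{F}^{\text{ext}}, \tag{1.23}$$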

where I have introduced the notation $\mathbf{F}^{\text{ext}}$ to denote the total external force on our 
two-particle system. 

The result (1.23) is the first in a series of important results that let us construct a 
theory of many-particle systems from the basic laws for a single particle. It asserts 
that as far as the total momentum of a system is concerned, the internal forces have 
no effect. A special case of this result is that if there are no external forces ($\mathbf{F}^{\text{ext}} = 0$) 
then $\dot{\mathbf{P}} = 0$. Thus we have the important result: 

$$\text{If } \mathbf{F}^{\text{ext}} = 0, \text{ then } \mathbf{P} = \text{const}. \tag{1.24}$$

In the absence of external forces, the total momentum of our two-particle system is 
constant — a result called the principle of conservation of momentum. 
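
As a quick illustration of (1.24), consider the two skaters mentioned earlier pushing off from each other, starting at rest, with external forces such as wind and friction neglected. The masses and the speed below are made-up numbers chosen only to show the arithmetic.

```latex
\mathbf{P} = m_1\mathbf{v}_1 + m_2\mathbf{v}_2 = 0
\quad\Longrightarrow\quad
\mathbf{v}_2 = -\frac{m_1}{m_2}\,\mathbf{v}_1 ,
\qquad\text{e.g. } m_1 = 60\ \text{kg},\; m_2 = 80\ \text{kg},\; v_1 = 2\ \text{m/s}
\;\Rightarrow\; v_2 = -1.5\ \text{m/s}.
```

The minus sign says that the skaters move apart in opposite directions, with the heavier one moving more slowly.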

Multiparticle Systems 

We have proved the conservation of momentum, Equation (1.24), for a system of two 
particles. The extension of the result to any number of particles is straightforward in 
principle, but I would like to go through it in detail, because it lets me introduce some 
Figure 1.7 A five-particle system with particles labelled by α or 
β = 1, 2, …, 5. The particle α is subject to four internal forces, 
shown by solid arrows and denoted $\mathbf{F}_{\alpha\beta}$ (the force on α by β). In 
addition particle α may be subject to a net external force, shown 
by the dashed arrow and denoted $\mathbf{F}_\alpha^{\text{ext}}$. 


important notation and will give you some practice using the summation notation. Let 
us consider then a system of N particles. I shall label the typical particle with a Greek 
index α or β, either of which can take any of the values 1, 2, …, N. The mass of 
particle α is $m_\alpha$ and its momentum is $\mathbf{p}_\alpha$. The force on particle α is quite complicated: 
Each of the other $(N - 1)$ particles can exert a force which I shall call $\mathbf{F}_{\alpha\beta}$, the force 
on α by β, as illustrated in Figure 1.7. In addition there may be a net external force 
on particle α, which I shall call $\mathbf{F}_\alpha^{\text{ext}}$. Thus the net force on particle α is 

$$(\text{net force on particle } \alpha) = \mathbf{F}_\alpha = \sum_{\beta \neq \alpha} \mathbf{F}_{\alpha\beta} + \mathbf{F}_\alpha^{\text{ext}}. \tag{1.25}$$

Here the sum runs over all values of β not equal to α. (Remember there is no force 
$\mathbf{F}_{\alpha\alpha}$ because particle α cannot exert a force on itself.) According to Newton’s second 
law, this is the same as the rate of change of $\mathbf{p}_\alpha$: 

$$\dot{\mathbf{p}}_\alpha = \sum_{\beta \neq \alpha} \mathbf{F}_{\alpha\beta} + \mathbf{F}_\alpha^{\text{ext}}. \tag{1.26}$$

This result holds for each α = 1, …, N. 

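Let us now consider the total momentum of our N-particle system, 

$$\mathbf{P} = \sum_\alpha \mathbf{p}_\alpha$$

where, of course, this sum runs over all N particles, α = 1, 2, …, N. If we differentiate 
this equation with respect to time, we find 

$$\dot{\mathbf{P}} = \sum_\alpha \dot{\mathbf{p}}_\alpha$$

or, substituting for $\dot{\mathbf{p}}_\alpha$ from (1.26), 

$$\dot{\mathbf{P}} = \sum_\alpha \sum_{\beta \neq \alpha} \mathbf{F}_{\alpha\beta} + \sum_\alpha \mathbf{F}_\alpha^{\text{ext}}. \tag{1.27}$$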
The double sum here contains $N(N - 1)$ terms in all. Each term $\mathbf{F}_{\alpha\beta}$ in this sum can 
be paired with a second term $\mathbf{F}_{\beta\alpha}$ (that is, $\mathbf{F}_{12}$ paired with $\mathbf{F}_{21}$, and so on), so that 

$$\sum_\alpha \sum_{\beta \neq \alpha} \mathbf{F}_{\alpha\beta} = \sum_\alpha \sum_{\beta > \alpha} (\mathbf{F}_{\alpha\beta} + \mathbf{F}_{\beta\alpha}). \tag{1.28}$$

The double sum on the right includes only values of α and β with α < β and has half 
as many terms as that on the left. But each term is the sum of two forces, $(\mathbf{F}_{\alpha\beta} + \mathbf{F}_{\beta\alpha})$, 
and, by the third law, each such sum is zero. Therefore the whole double sum in (1.28) 
is zero, and returning to (1.27) we conclude that 

$$\dot{\mathbf{P}} = \sum_\alpha \mathbf{F}_\alpha^{\text{ext}} \equiv \mathbf{F}^{\text{ext}}. \tag{1.29}$$

The result (1.29) corresponds exactly to the two-particle result (1.23). Like the 
latter, it says that the internal forces have no effect on the evolution of the total 
momentum P — the rate of change of P is determined by the net external force on the 
system. In particular, if the net external force is zero, we have the 

Principle of Conservation of Momentum 
If the net external force $\mathbf{F}^{\text{ext}}$ on an N-particle system is zero, the system’s total 
momentum P is constant. 


As you are certainly aware, this is one of the most important results in classical 
physics and is, in fact, also true in relativity and quantum mechanics. If you are not 
very familiar with the sorts of manipulations of sums that we used, it would be a good 
idea to go over the argument leading from (1.25) to (1.29) for the case of three or four 
particles, writing out all the sums explicitly (Problems 1.28 or 1.29). You should also 
convince yourself that, conversely, if the principle of conservation of momentum is 
true for all multiparticle systems, then Newton’s third law must be true (Problem 1.31). 
In other words, conservation of momentum and Newton’s third law are equivalent to 
one another. 
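
The cancellation of the internal forces in the double sum can also be checked numerically. The sketch below is illustrative only: it assumes a made-up spring-like internal force that obeys the third law, random positions and external forces in two dimensions, and hypothetical function names. It builds the net force on each particle as in (1.25) and compares the sum over all particles with the sum of the external forces alone, as in (1.29).

```python
import random


def internal_force(r_a, r_b, k=1.0):
    """Hypothetical spring-like force on particle a due to particle b.

    It changes sign when a and b are swapped, so it obeys Newton's third law.
    """
    return tuple(k * (xb - xa) for xa, xb in zip(r_a, r_b))


def net_forces(positions, external):
    """Net force on each particle: sum over b != a of F_ab, plus the external force."""
    n = len(positions)
    forces = []
    for a in range(n):
        total = list(external[a])
        for b in range(n):
            if b == a:
                continue  # a particle exerts no force on itself
            f_ab = internal_force(positions[a], positions[b])
            total = [t + f for t, f in zip(total, f_ab)]
        forces.append(tuple(total))
    return forces


random.seed(0)
N = 5  # number of particles, as in the five-particle figure
positions = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(N)]
external = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(N)]

forces = net_forces(positions, external)
P_dot = tuple(sum(f[i] for f in forces) for i in range(2))    # rate of change of total momentum
F_ext = tuple(sum(f[i] for f in external) for i in range(2))  # total external force
print("sum of all net forces:  ", P_dot)
print("sum of external forces: ", F_ext)
```

The two printed vectors agree to rounding error whatever the positions are, because each internal force is cancelled by its third-law partner.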


Validity of Newton’s Third Law 

Within the domain of classical physics, the third law, like the second, is valid with 
such accuracy that it can be taken to be exact. As speeds approach the speed of light, 
it is easy to see that the third law cannot hold: The point is that the law asserts that 
the action and reaction forces, $\mathbf{F}_{12}(t)$ and $\mathbf{F}_{21}(t)$, measured at the same time t, are 
equal and opposite. As you certainly know, once relativity becomes important the 
concept of a single universal time has to be abandoned: two events that are seen as 
simultaneous by one observer are, in general, not simultaneous as seen by a second 
observer. Thus, even if the equality $\mathbf{F}_{12}(t) = -\mathbf{F}_{21}(t)$ (with both times the same) were 
true for one observer, it would generally be false for another. Therefore, the third law 
cannot be valid once relativity becomes important. 
Figure 1.8 Each of the positive charges $q_1$ and $q_2$ produces 
a magnetic field that exerts a force on the other charge. The 
resulting magnetic forces $\mathbf{F}_{12}$ and $\mathbf{F}_{21}$ do not obey Newton’s 
third law. 


Rather surprisingly, there is a simple example of a well-known force (the magnetic 
force between two moving charges) for which the third law is not exactly true, 
even at slow speeds. To see this, consider the two positive charges of Figure 1.8, with 
$q_1$ moving in the x direction and $q_2$ moving in the y direction, as shown. The exact 
calculation of the magnetic field produced by each charge is complicated, but a simple 
argument gives the correct directions of the two fields, and this is all we need. The 
moving charge $q_1$ is equivalent to a current in the x direction. By the right-hand rule 
for fields, this produces a magnetic field which points in the z direction in the vicinity 
of $q_2$. By the right-hand rule for forces, this field produces a force $\mathbf{F}_{21}$ on $q_2$ that is 
in the x direction. An exactly analogous argument (check it yourself) shows that the 
force $\mathbf{F}_{12}$ on $q_1$ is in the y direction, as shown. Clearly these two forces do not obey 
Newton’s third law! 

This conclusion is especially startling since we have just seen that Newton’s third 
law is equivalent to the conservation of momentum. Apparently the total momentum 
$m_1\mathbf{v}_1 + m_2\mathbf{v}_2$ of the two charges in Figure 1.8 is not conserved. This conclusion, which 
is correct, serves to remind us that the “mechanical” momentum mv of particles is not 
the only kind of momentum. Electromagnetic fields can also carry momentum, and in 
the situation of Figure 1.8 the mechanical momentum being lost by the two particles 
is going to the electromagnetic momentum of the fields. 

Fortunately, if both speeds in Figure 1.8 are much less than the speed of light 
($v \ll c$), the loss of mechanical momentum and the concomitant failure of the third 
law are completely negligible. To see this, note that in addition to the magnetic force 
between $q_1$ and $q_2$ there is the electrostatic Coulomb force 9 $kq_1q_2/r^2$, which does obey 
Newton’s third law. It is a straightforward exercise (Problem 1.32) to show that the 
magnetic force is of order $v^2/c^2$ times the Coulomb force. Thus only as v approaches 
c — and classical mechanics must give way to relativity anyway — is the violation of 


9 Here k is the Coulomb force constant, often written as $k = 1/(4\pi\epsilon_0)$. 

the third law by the magnetic force important. 10 We see that the unexpected situation of 
Figure 1.8 does not contradict our claim that in the classical domain Newton’s third 
law is valid, and this is what we shall assume in our discussions of nonrelativistic 
mechanics. 
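
The directions of the two magnetic forces, and their size relative to the Coulomb force, can be estimated with a short calculation. This is a sketch under stated assumptions, not the exact calculation referred to in the text: it uses the standard low-speed approximation for the field of a moving point charge, B = (k/c²) q v × r̂ / r², and a particular geometry (the positions, charges, and speeds below) that is hypothetical and chosen only so that the forces come out along x and y as described.

```python
import math

K = 8.9875517873681764e9   # Coulomb constant k = 1/(4*pi*eps0), N m^2 / C^2 (approximate)
C = 299792458.0            # speed of light, m/s


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def magnetic_force(q_on, v_on, r_on, q_src, v_src, r_src):
    """Magnetic force on charge q_on due to the moving charge q_src.

    Assumes the low-speed field B = (k/c^2) q_src (v_src x r_hat) / r^2,
    then applies F = q_on (v_on x B).
    """
    sep = tuple(a - b for a, b in zip(r_on, r_src))   # points from source to target
    dist = math.sqrt(sum(s * s for s in sep))
    r_hat = tuple(s / dist for s in sep)
    b_field = tuple((K / C**2) * q_src * comp / dist**2
                    for comp in cross(v_src, r_hat))
    return tuple(q_on * comp for comp in cross(v_on, b_field))


# Hypothetical setup: equal positive charges and equal speeds, with v << c.
q1 = q2 = 1.0e-6                              # coulombs
v = 1.0e5                                     # m/s
r1, v1 = (0.0, 0.0, 0.0), (v, 0.0, 0.0)       # q1 moves in the x direction
r2, v2 = (-1.0, 1.0, 0.0), (0.0, v, 0.0)      # q2 moves in the y direction

F21 = magnetic_force(q2, v2, r2, q1, v1, r1)  # force on 2 by 1
F12 = magnetic_force(q1, v1, r1, q2, v2, r2)  # force on 1 by 2

dist_sq = sum((a - b) ** 2 for a, b in zip(r2, r1))
coulomb = K * q1 * q2 / dist_sq               # magnitude of the Coulomb force
ratio = math.sqrt(sum(f * f for f in F21)) / coulomb

print("F21 =", F21)
print("F12 =", F12)
print("F12 + F21 =", tuple(a + b for a, b in zip(F12, F21)))
print("magnetic/Coulomb =", ratio, "  v^2/c^2 =", (v / C) ** 2)
```

With this geometry the force on $q_2$ lies along x and the force on $q_1$ along y, so their sum is not zero, while the ratio of the magnetic to the Coulomb force is of order $v^2/c^2$.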


1.6 Newton’s Second Law in Cartesian Coordinates 


Of Newton’s three laws, the one that we actually use the most is the second, which is 
often described as the equation of motion. As we have seen, the first is theoretically 
important to define what we mean by inertial frames but is usually of no practical 
use beyond this. The third law is crucially important in sorting out the internal forces 
in a multiparticle system, but, once we know the forces involved, the second law is 
what we actually use to calculate the motion of the object or objects of interest. In 
particular, in many simple problems the forces are known or easily found, and, in this 
case, the second law is all we need for solving the problem. 

As we have already noted, the second law, 

$$\mathbf{F} = m\ddot{\mathbf{r}}, \tag{1.30}$$

is a second-order differential equation 11 for the position vector r as a function of the 
time t. In the prototypical problem, the forces that comprise F are given, and our job 
is to solve the differential equation (1.30) for r(t). Sometimes we are told about r(t), 
and we have to use (1.30) to find some of the forces. In any case, the equation (1.30) is 
a vector differential equation. And the simplest way to solve such equations is almost 
always to resolve the vectors into their components relative to a chosen coordinate 
system. 

Conceptually the simplest coordinate system is the Cartesian (or rectangular), with 
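unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$, in terms of which the net force F can then be written as 

$$\mathbf{F} = F_x\hat{\mathbf{x}} + F_y\hat{\mathbf{y}} + F_z\hat{\mathbf{z}} \tag{1.31}$$

and the position vector r as 

$$\mathbf{r} = x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}. \tag{1.32}$$

As we noted in Section 1.2, this expansion of r in terms of its Cartesian components 
is especially easy to differentiate because the unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, $\hat{\mathbf{z}}$ are constant. Thus 
we can differentiate (1.32) twice to get the simple result 

$$\ddot{\mathbf{r}} = \ddot{x}\,\hat{\mathbf{x}} + \ddot{y}\,\hat{\mathbf{y}} + \ddot{z}\,\hat{\mathbf{z}}. \tag{1.33}$$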


¹⁰ The magnetic force between two steady currents is not necessarily small, even in the classical
domain, but it can be shown that this force does obey the third law. See Problem 1.33.

¹¹ The force $\mathbf{F}$ can sometimes involve derivatives of $\mathbf{r}$. (For instance the magnetic force on a
moving charge involves the velocity $\mathbf{v} = \dot{\mathbf{r}}$.) Very occasionally the force $\mathbf{F}$ involves a higher derivative
of $\mathbf{r}$, of order $n > 2$, in which case the second law is an $n$th-order differential equation.
That is, the three Cartesian components of $\ddot{\mathbf{r}}$ are just the appropriate derivatives of the
three coordinates $x$, $y$, $z$ of $\mathbf{r}$, and the second law (1.30) becomes

$$F_x\,\hat{\mathbf{x}} + F_y\,\hat{\mathbf{y}} + F_z\,\hat{\mathbf{z}} = m\ddot{x}\,\hat{\mathbf{x}} + m\ddot{y}\,\hat{\mathbf{y}} + m\ddot{z}\,\hat{\mathbf{z}}. \tag{1.34}$$

Resolving this equation into its three separate components, we see that $F_x$ has to equal
$m\ddot{x}$ and similarly for the $y$ and $z$ components. That is, in Cartesian coordinates, the
single vector equation (1.30) is equivalent to the three separate equations:

$$\begin{cases} F_x = m\ddot{x} \\ F_y = m\ddot{y} \\ F_z = m\ddot{z}. \end{cases} \tag{1.35}$$

This beautiful result, that, in Cartesian coordinates, Newton’s second law in three 
dimensions is equivalent to three one-dimensional versions of the same law, is the 
basis of the solution of almost all simple mechanics problems in Cartesian coordinates. 
Here is an example to remind you of how such problems go. 


EXAMPLE 1.1 A Block Sliding down an Incline

A block of mass $m$ is observed accelerating from rest down an incline that has
coefficient of friction $\mu$ and is at angle $\theta$ from the horizontal. How far will it
travel in time $t$?

Our first task is to choose our frame of reference. Naturally, we choose our
spatial origin at the block’s starting position and the origin of time ($t = 0$) at the
moment of release. As you no doubt remember from your introductory physics
course, the best choice of axes is to have one axis ($x$ say) point down the slope,
one ($y$) normal to the slope, and the third ($z$) across it, as shown in Figure 1.9.
This choice has two advantages: First, because the block slides straight down
the slope, the motion is entirely in the $x$ direction, and only $x$ varies. (If we had
chosen the $x$ axis horizontal and the $y$ axis vertical, then both $x$ and $y$ would
vary.) Second, two of the three forces on the block are unknown (the normal
force $\mathbf{N}$ and friction $\mathbf{f}$; the weight, $\mathbf{w} = m\mathbf{g}$, we treat as known), and with our
choice of axes, each of the unknowns has only one nonzero component, since
$\mathbf{N}$ is in the $y$ direction and $\mathbf{f}$ is in the (negative) $x$ direction.

We are now ready to apply Newton’s second law. The result (1.35) means
that we can analyze the three components separately, as follows:

There are no forces in the $z$ direction, so $F_z = 0$. Since $F_z = m\ddot{z}$, it follows
that $\ddot{z} = 0$, which implies that $\dot{z}$ (or $v_z$) is constant. Since the block starts from
rest, this means that $\dot{z}$ is actually zero for all $t$. With $\dot{z} = 0$, it follows that $z$ is
constant, and, since it too starts from zero, we conclude that $z = 0$ for all $t$. As
we would certainly have guessed, the motion remains in the $xy$ plane.

Since the block does not jump off the incline, we know that there is no motion
in the $y$ direction. In particular, $\ddot{y} = 0$. Therefore, Newton’s second law implies
that the $y$ component of the net force is zero; that is, $F_y = 0$. From Figure 1.9
we see that this implies that

$$F_y = N - mg\cos\theta = 0.$$
Figure 1.9 A block slides down a slope of incline $\theta$. The three
forces on the block are its weight, $\mathbf{w} = m\mathbf{g}$, the normal force
of the incline, $\mathbf{N}$, and the frictional force $\mathbf{f}$, whose magnitude is
$f = \mu N$. The $z$ axis is not shown but points out of the page, that
is, across the slope.


Thus the $y$ component of the second law has told us that the unknown
normal force is $N = mg\cos\theta$. Since $f = \mu N$, this tells us the frictional force,
$f = \mu mg\cos\theta$, and all the forces are now known. All that remains is to use
the remaining component (the $x$ component) of the second law to solve for the
actual motion.

The $x$ component of the second law, $F_x = m\ddot{x}$, implies (see Figure 1.9) that

$$w_x - f = m\ddot{x}$$

or

$$mg\sin\theta - \mu mg\cos\theta = m\ddot{x}.$$

The $m$’s cancel, and we find for the acceleration down the slope

$$\ddot{x} = g(\sin\theta - \mu\cos\theta). \tag{1.36}$$

Having found $\ddot{x}$, and found it to be constant, we have only to integrate it twice
to find $x$ as a function of $t$. First

$$\dot{x} = g(\sin\theta - \mu\cos\theta)\,t.$$

(Remember that $\dot{x} = 0$ initially, so the constant of integration is zero.) Finally,

$$x(t) = \tfrac{1}{2}\,g(\sin\theta - \mu\cos\theta)\,t^2$$

(again the constant of integration is zero) and our solution is complete.
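
As a quick numerical check on this example, the following Python sketch evaluates the acceleration (1.36) and the resulting distance $x(t)$, and compares the closed form with a crude step-by-step integration of the equation of motion. The function names and the sample values (a 30° incline, $\mu = 0.2$, $t = 2$ s) are illustrative assumptions and are not part of the example itself; the formula applies only when the block actually slides, that is, when $\tan\theta > \mu$.

```python
import math

G = 9.8  # m/s^2, the value of g used elsewhere in this chapter


def incline_acceleration(theta, mu, g=G):
    """Acceleration down the slope, Eq. (1.36): g (sin(theta) - mu cos(theta))."""
    return g * (math.sin(theta) - mu * math.cos(theta))


def distance_closed_form(theta, mu, t, g=G):
    """Distance travelled from rest in time t: x(t) = (1/2) a t^2."""
    return 0.5 * incline_acceleration(theta, mu, g) * t**2


def distance_numerical(theta, mu, t, steps=100_000, g=G):
    """Integrate x'' = a twice in small time steps (semi-implicit Euler)."""
    a = incline_acceleration(theta, mu, g)
    dt = t / steps
    x, v = 0.0, 0.0  # the block starts at the origin, at rest
    for _ in range(steps):
        v += a * dt
        x += v * dt
    return x


if __name__ == "__main__":
    theta = math.radians(30.0)  # hypothetical incline angle
    mu = 0.2                    # hypothetical coefficient of friction
    t = 2.0                     # seconds
    print(distance_closed_form(theta, mu, t))
    print(distance_numerical(theta, mu, t))
```

The two printed numbers should agree closely; the small residual difference comes from the finite step size of the numerical integration, not from the physics.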
Figure 1.10 The definition of the polar coordinates $r$ and $\phi$.


1.7 Two-Dimensional Polar Coordinates 


While Cartesian coordinates have the merit of simplicity, we are going to find that it is 
almost impossible to solve certain problems without the use of various non-Cartesian 
coordinate systems. To illustrate the complexities of non-Cartesian coordinates, let us 
consider the form of Newton’s second law in a two-dimensional problem using polar 
coordinates. These coordinates are defined in Figure 1.10. Instead of using the two 
rectangular coordinates $x$, $y$, we label the position of a particle with its distance $r$ from
$O$ and the angle $\phi$ measured up from the $x$ axis. Given the rectangular coordinates
$x$ and $y$, you can calculate the polar coordinates $r$ and $\phi$, or vice versa, using the
following relations. (Make sure you understand all four equations.¹²)

$$\left.\begin{aligned} x &= r\cos\phi \\ y &= r\sin\phi \end{aligned}\right\} \iff \left\{\begin{aligned} r &= \sqrt{x^2 + y^2} \\ \phi &= \arctan(y/x) \end{aligned}\right. \tag{1.37}$$
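
The four relations between rectangular and polar coordinates translate directly into code. The Python sketch below is only an illustration (the function names are hypothetical); it uses `math.atan2(y, x)` rather than a bare arctan of $y/x$, which is one common way of making sure the angle lands in the proper quadrant.

```python
import math


def to_polar(x, y):
    """Return (r, phi) for the Cartesian point (x, y), with -pi < phi <= pi."""
    r = math.hypot(x, y)    # sqrt(x^2 + y^2)
    phi = math.atan2(y, x)  # quadrant-aware version of arctan(y/x)
    return r, phi


def to_cartesian(r, phi):
    """Return (x, y) for the polar point (r, phi)."""
    return r * math.cos(phi), r * math.sin(phi)


if __name__ == "__main__":
    # (1, 1) and (-1, -1) have the same ratio y/x but lie in different quadrants.
    print(to_polar(1.0, 1.0))
    print(to_polar(-1.0, -1.0))
    print(math.atan(-1.0 / -1.0))  # a bare arctan cannot tell the two points apart
```

Note that `atan2` returns angles between $-\pi$ and $\pi$; if you prefer angles between $0$ and $2\pi$, add $2\pi$ to any negative result.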

Just as with rectangular coordinates, it is convenient to introduce two unit vectors,
which I shall denote by $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\phi}}$. To understand their definitions, notice that we can
define the unit vector $\hat{\mathbf{x}}$ as the unit vector that points in the direction of increasing $x$
when $y$ is fixed, as shown in Figure 1.11(a). In the same way we shall define $\hat{\mathbf{r}}$ as
the unit vector that points in the direction we move when $r$ increases with $\phi$ fixed;
likewise, $\hat{\boldsymbol{\phi}}$ is the unit vector that points in the direction we move when $\phi$ increases
with $r$ fixed. Figure 1.11 makes clear a most important difference between the unit
vectors $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ of rectangular coordinates and our new unit vectors $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\phi}}$. The
vectors $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ are the same at all points in the plane, whereas the new vectors $\hat{\mathbf{r}}$ and
$\hat{\boldsymbol{\phi}}$ change their directions as the position vector $\mathbf{r}$ moves around. We shall see that this
complicates the use of Newton’s second law in polar coordinates.

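Figure 1.11 suggests another way to write the unit vector $\hat{\mathbf{r}}$. Since $\hat{\mathbf{r}}$ is in the same
direction as $\mathbf{r}$, but has magnitude 1, you can see that

$$\hat{\mathbf{r}} = \frac{\mathbf{r}}{|\mathbf{r}|}. \tag{1.38}$$

This result suggests a second role for the “hat” notation. For any vector $\mathbf{a}$, we can
define $\hat{\mathbf{a}}$ as the unit vector in the direction of $\mathbf{a}$, namely $\hat{\mathbf{a}} = \mathbf{a}/|\mathbf{a}|$.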


¹² There is a small subtlety concerning the equation for $\phi$: You need to make sure $\phi$ lands in the
proper quadrant, since the first and third quadrants give the same values for $y/x$ (and likewise the
second and fourth). See Problem 1.42.
Figure 1.11 (a) The unit vector $\hat{\mathbf{x}}$ points in the direction of increasing
$x$ with $y$ fixed. (b) The unit vector $\hat{\mathbf{r}}$ points in the direction of
increasing $r$ with $\phi$ fixed; $\hat{\boldsymbol{\phi}}$ points in the direction of increasing $\phi$
with $r$ fixed. Unlike $\hat{\mathbf{x}}$, the vectors $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\phi}}$ change as the position
vector $\mathbf{r}$ moves.


Since the two unit vectors $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\phi}}$ are perpendicular vectors in our two-dimensional
space, any vector can be expanded in terms of them. For instance, the net force $\mathbf{F}$ on
an object can be written

$$\mathbf{F} = F_r\,\hat{\mathbf{r}} + F_\phi\,\hat{\boldsymbol{\phi}}. \tag{1.39}$$

If, for example, the object in question is a stone that I am twirling in a circle on the
end of a string (with my hand at the origin), then $F_r$ would be the tension in the string
and $F_\phi$ the force of air resistance retarding the stone in the tangential direction. The
expansion of the position vector itself is especially simple in polar coordinates. From
Figure 1.11(b) it is clear that

$$\mathbf{r} = r\,\hat{\mathbf{r}}. \tag{1.40}$$

We are now ready to ask about the form of Newton’s second law, $\mathbf{F} = m\ddot{\mathbf{r}}$, in polar
coordinates. In rectangular coordinates, we saw that the $x$ component of $\ddot{\mathbf{r}}$ is just $\ddot{x}$, and
this is what led to the very simple result (1.35). We must now find the components of
$\ddot{\mathbf{r}}$ in polar coordinates; that is, we must differentiate (1.40) with respect to $t$. Although
(1.40) is very simple, the vector $\hat{\mathbf{r}}$ changes as $\mathbf{r}$ moves. Thus when we differentiate
(1.40), we shall pick up a term involving the derivative of $\hat{\mathbf{r}}$. Our first task is to find
this derivative of $\hat{\mathbf{r}}$.

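Figure 1.12(a) shows the position of the particle of interest at two successive times,
$t_1$ and $t_2 = t_1 + \Delta t$. If the corresponding angles $\phi(t_1)$ and $\phi(t_2)$ are different, then the
two unit vectors $\hat{\mathbf{r}}(t_1)$ and $\hat{\mathbf{r}}(t_2)$ point in different directions. The change in $\hat{\mathbf{r}}$ is shown
in Figure 1.12(b), and (provided $\Delta t$ is small) is approximately

$$\Delta\hat{\mathbf{r}} \approx \Delta\phi\,\hat{\boldsymbol{\phi}} \approx \dot{\phi}\,\Delta t\,\hat{\boldsymbol{\phi}}. \tag{1.41}$$

(Notice that the direction of $\Delta\hat{\mathbf{r}}$ is perpendicular to $\hat{\mathbf{r}}$, namely the direction of $\hat{\boldsymbol{\phi}}$.) If
we divide both sides by $\Delta t$ and take the limit as $\Delta t \to 0$, then $\Delta\hat{\mathbf{r}}/\Delta t \to d\hat{\mathbf{r}}/dt$ and
we find that

$$\frac{d\hat{\mathbf{r}}}{dt} = \dot{\phi}\,\hat{\boldsymbol{\phi}}. \tag{1.42}$$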
Figure 1.12 (a) The positions of a particle at two successive
times, $t_1$ and $t_2$. Unless the particle is moving exactly radially,
the corresponding unit vectors $\hat{\mathbf{r}}(t_1)$ and $\hat{\mathbf{r}}(t_2)$ point in different
directions. (b) The change $\Delta\hat{\mathbf{r}}$ in $\hat{\mathbf{r}}$ is given by the triangle
shown.


(For an alternative proof of this important result, see Problem 1.43.) Notice that $d\hat{\mathbf{r}}/dt$
is in the direction of $\hat{\boldsymbol{\phi}}$ and is proportional to the rate of change of the angle $\phi$, both
of which properties we would expect based on Figure 1.12.
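
If you would like to see this result confirmed numerically, here is a minimal Python sketch. It assumes the standard Cartesian components of the polar unit vectors, $\hat{\mathbf{r}} = (\cos\phi, \sin\phi)$ and $\hat{\boldsymbol{\phi}} = (-\sin\phi, \cos\phi)$, which are not derived in the text above, and it uses an arbitrary made-up angle history $\phi(t)$; all the function names are hypothetical. The sketch compares a finite-difference estimate of the derivative of $\hat{\mathbf{r}}$ with the product of $\dot{\phi}$ and $\hat{\boldsymbol{\phi}}$.

```python
import math


def r_hat(phi):
    """Unit vector in the direction of increasing r, in Cartesian components."""
    return (math.cos(phi), math.sin(phi))


def phi_hat(phi):
    """Unit vector in the direction of increasing phi, in Cartesian components."""
    return (-math.sin(phi), math.cos(phi))


def phi_of_t(t):
    """Hypothetical angle history phi(t), chosen only for this check."""
    return 0.3 + 0.5 * t + 0.2 * t**2


def phi_dot_of_t(t):
    """Exact time derivative of phi_of_t."""
    return 0.5 + 0.4 * t


def r_hat_rate_numeric(t, dt=1e-6):
    """Central-difference estimate of d(r_hat)/dt at time t."""
    ahead = r_hat(phi_of_t(t + dt))
    behind = r_hat(phi_of_t(t - dt))
    return tuple((a - b) / (2 * dt) for a, b in zip(ahead, behind))


def r_hat_rate_formula(t):
    """The claimed result: phi_dot times phi_hat."""
    w = phi_dot_of_t(t)
    return tuple(w * c for c in phi_hat(phi_of_t(t)))


if __name__ == "__main__":
    print(r_hat_rate_numeric(1.2))
    print(r_hat_rate_formula(1.2))
```

The two printed vectors should agree to many decimal places, whatever smooth function you substitute for `phi_of_t` (provided you update `phi_dot_of_t` to match).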

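Now that we know the derivative of $\hat{\mathbf{r}}$, we are ready to differentiate Equation (1.40).
Using the product rule, we get two terms:

$$\dot{\mathbf{r}} = \dot{r}\,\hat{\mathbf{r}} + r\,\frac{d\hat{\mathbf{r}}}{dt},$$

and, substituting (1.42), we find for the velocity $\dot{\mathbf{r}}$, or $\mathbf{v}$,

$$\mathbf{v} = \dot{\mathbf{r}} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\phi}\,\hat{\boldsymbol{\phi}}. \tag{1.43}$$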

From this we can read off the polar components of the velocity:

$$v_r = \dot{r} \quad\text{and}\quad v_\phi = r\dot{\phi} = r\omega \tag{1.44}$$

where in the second equation I have introduced the traditional notation $\omega$ for the angular
velocity $\dot{\phi}$. While the results in (1.44) should be familiar from your introductory
physics course, they are undeniably more complicated than the corresponding results
in Cartesian coordinates ($v_x = \dot{x}$ and $v_y = \dot{y}$).

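Before we can write down Newton’s second law, we have to differentiate a second
time to find the acceleration:

$$\mathbf{a} = \ddot{\mathbf{r}} = \frac{d}{dt}\dot{\mathbf{r}} = \frac{d}{dt}\left(\dot{r}\,\hat{\mathbf{r}} + r\dot{\phi}\,\hat{\boldsymbol{\phi}}\right), \tag{1.45}$$

where the final expression comes from substituting (1.43) for $\dot{\mathbf{r}}$. To complete the
differentiation in (1.45), we must calculate the derivative of $\hat{\boldsymbol{\phi}}$. This calculation is
completely analogous to the argument leading to (1.42) and is illustrated in Figure
1.13. By inspecting this figure, you should be able to convince yourself that

$$\frac{d\hat{\boldsymbol{\phi}}}{dt} = -\dot{\phi}\,\hat{\mathbf{r}}. \tag{1.46}$$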
Figure 1.13 (a) The unit vector $\hat{\boldsymbol{\phi}}$ at two successive times
$t_1$ and $t_2$. (b) The change $\Delta\hat{\boldsymbol{\phi}}$.


Returning to Equation (1.45), we can now carry out the differentiation to give the
following five terms:

$$\mathbf{a} = \left(\ddot{r}\,\hat{\mathbf{r}} + \dot{r}\,\frac{d\hat{\mathbf{r}}}{dt}\right) + \left((\dot{r}\dot{\phi} + r\ddot{\phi})\,\hat{\boldsymbol{\phi}} + r\dot{\phi}\,\frac{d\hat{\boldsymbol{\phi}}}{dt}\right)$$

or, if we use (1.42) and (1.46) to replace the derivatives of the two unit vectors,

$$\mathbf{a} = \left(\ddot{r} - r\dot{\phi}^2\right)\hat{\mathbf{r}} + \left(r\ddot{\phi} + 2\dot{r}\dot{\phi}\right)\hat{\boldsymbol{\phi}}. \tag{1.47}$$

This horrible result is a little easier to understand if we consider the special case
that $r$ is constant, as is the case for a stone that I twirl on the end of a string of fixed
length. With $r$ constant, both derivatives of $r$ are zero, and (1.47) has just two terms:

$$\mathbf{a} = -r\dot{\phi}^2\,\hat{\mathbf{r}} + r\ddot{\phi}\,\hat{\boldsymbol{\phi}}$$

or

$$\mathbf{a} = -r\omega^2\,\hat{\mathbf{r}} + r\alpha\,\hat{\boldsymbol{\phi}},$$

where $\omega = \dot{\phi}$ denotes the angular velocity and $\alpha = \ddot{\phi}$ is the angular acceleration. This
is the familiar result from elementary physics that when a particle moves around a
fixed circle, it has an inward “centripetal” acceleration $r\omega^2$ (or $v^2/r$) and a tangential
acceleration, $r\alpha$. Nevertheless, when $r$ is not constant, the acceleration includes all
four of the terms in (1.47). The first term, $\ddot{r}$ in the radial direction, is what you would
probably expect when $r$ varies, but the final term, $2\dot{r}\dot{\phi}$ in the $\hat{\boldsymbol{\phi}}$ direction, is harder
to understand. It is called the Coriolis acceleration, and I shall discuss it in detail in
Chapter 9.
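
Because the four-term result for the acceleration is easy to get wrong, it may be reassuring to test it on a computer. The Python sketch below is an illustration only: it picks an arbitrary made-up trajectory, $r(t) = 1 + 0.5t$ and $\phi(t) = t^2$, differentiates the Cartesian coordinates $x = r\cos\phi$, $y = r\sin\phi$ twice by finite differences, projects the result onto the radial and tangential directions, and compares with the polar components $\ddot{r} - r\dot{\phi}^2$ and $r\ddot{\phi} + 2\dot{r}\dot{\phi}$. The function names and the trajectory are assumptions made for this check.

```python
import math


def position(t):
    """Hypothetical trajectory in polar form: r(t) = 1 + 0.5 t, phi(t) = t^2."""
    return 1.0 + 0.5 * t, t**2


def cartesian(t):
    """Cartesian coordinates (x, y) of the same trajectory."""
    r, phi = position(t)
    return r * math.cos(phi), r * math.sin(phi)


def acceleration_numeric(t, h=1e-4):
    """Polar components (a_r, a_phi) from finite differences of x(t), y(t)."""
    (x0, y0), (x1, y1), (x2, y2) = cartesian(t - h), cartesian(t), cartesian(t + h)
    ax = (x0 - 2 * x1 + x2) / h**2
    ay = (y0 - 2 * y1 + y2) / h**2
    _, phi = position(t)
    # Project the Cartesian acceleration onto r_hat and phi_hat.
    a_r = ax * math.cos(phi) + ay * math.sin(phi)
    a_phi = -ax * math.sin(phi) + ay * math.cos(phi)
    return a_r, a_phi


def acceleration_formula(t):
    """Polar components of the acceleration, using the exact derivatives."""
    r, _ = position(t)
    r_dot, r_ddot = 0.5, 0.0
    phi_dot, phi_ddot = 2.0 * t, 2.0
    return r_ddot - r * phi_dot**2, r * phi_ddot + 2.0 * r_dot * phi_dot


if __name__ == "__main__":
    print(acceleration_numeric(1.0))
    print(acceleration_formula(1.0))
```

For this trajectory the Coriolis term $2\dot{r}\dot{\phi}$ is not zero, so the tangential components agree only if that term is included.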

Having calculated the acceleration as in (1.47), we can finally write down Newton’s 
second law in terms of polar coordinates: 


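$$\mathbf{F} = m\mathbf{a} \iff \begin{cases} F_r = m\left(\ddot{r} - r\dot{\phi}^2\right) \\ F_\phi = m\left(r\ddot{\phi} + 2\dot{r}\dot{\phi}\right). \end{cases} \tag{1.48}$$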


These equations in polar coordinates are a far cry from the beautifully simple equations
(1.35) for rectangular coordinates. In fact, one of the main reasons for taking the
trouble to recast Newtonian mechanics in the Lagrangian formulation (Chapter 7) is
that the latter is able to handle nonrectangular coordinates just as easily as rectangular.

You may justifiably be feeling that the second law in polar coordinates is so 
complicated that there could be no occasion to use it. In fact, however, there are many 
problems which are most easily solved using polar coordinates, and I conclude this 
section with an elementary example. 


EXAMPLE 1.2 An Oscillating Skateboard

A “half-pipe” at a skateboard park consists of a concrete trough with a semicircular
cross section of radius $R = 5$ m, as shown in Figure 1.14. I hold a frictionless
skateboard on the side of the trough pointing down toward the bottom and release
it. Discuss the subsequent motion using Newton’s second law. In particular, if
I release the board just a short way from the bottom, how long will it take to
come back to the point of release?

Because the skateboard is constrained to move on a circular path, this problem
is most easily solved using polar coordinates with origin $O$ at the center of
the pipe as shown. (At some point in the following calculation, try writing the
second law in rectangular coordinates and observe what a tangle you get.) With
this choice of polar coordinates, the coordinate $r$ of the skateboard is constant,
$r = R$, and the position of the skateboard is completely specified by the angle
$\phi$. With $r$ constant, the second law (1.48) takes the relatively simple form

$$F_r = -mR\dot{\phi}^2 \tag{1.49}$$

and

$$F_\phi = mR\ddot{\phi}. \tag{1.50}$$

The two forces on the skateboard are its weight $\mathbf{w} = m\mathbf{g}$ and the normal force $\mathbf{N}$
of the wall, as shown in Figure 1.14. The components of the net force $\mathbf{F} = \mathbf{w} + \mathbf{N}$
are easily seen to be

$$F_r = mg\cos\phi - N \quad\text{and}\quad F_\phi = -mg\sin\phi.$$


Figure 1.14 A skateboard in a semicircular trough
of radius $R$. The board’s position is specified by
the angle $\phi$ measured up from the bottom. The two
forces on the skateboard are its weight $\mathbf{w} = m\mathbf{g}$ and
the normal force $\mathbf{N}$.
Substituting for $F_r$ into (1.49) we get an equation involving $N$, $\phi$, and $\dot{\phi}$.
Fortunately, we are not really interested in $N$, and, even more fortunately,
when we substitute for $F_\phi$ into (1.50), we get an equation that does not involve
$N$ at all:

$$-mg\sin\phi = mR\ddot{\phi}$$

or, canceling the $m$’s and rearranging,

$$\ddot{\phi} = -\frac{g}{R}\sin\phi. \tag{1.51}$$

Equation (1.51) is the differential equation for $\phi(t)$ that determines the
motion of the skateboard. Qualitatively, we can easily see the kind of motion
that it implies. First, if $\phi = 0$, (1.51) says that $\ddot{\phi} = 0$. Therefore, if we place
the board at rest ($\dot{\phi} = 0$) at the point $\phi = 0$, the board will never move (unless
someone pushes it); that is, $\phi = 0$ is an equilibrium position, as you would
certainly have guessed. Next, suppose that at some time, $\phi$ is not zero and, to
be definite, suppose that $\phi > 0$; that is, the skateboard is on the right-hand side
of the half-pipe. In this case, (1.51) implies that $\ddot{\phi} < 0$, so the acceleration is
directed to the left. If the board is moving to the right it must slow down and
eventually start moving to the left.¹³ Once it is moving toward the left, it speeds
up and returns to the bottom, where it moves over to the left. As soon as the
board is on the left, the argument reverses ($\phi < 0$, so $\ddot{\phi} > 0$) and the board must
eventually return to the bottom and move over to the right again. In other words,
the differential equation (1.51) implies that the skateboard oscillates back and
forth, from right to left and back to the right.

The equation of motion (1.51) cannot be solved in terms of elementary functions,
such as polynomials, trigonometric functions, or logs and exponentials.¹⁴
Thus, if we want more quantitative information about the motion, the simplest
course is to use a computer to solve it numerically (see Problem 1.50). However,
if the initial angle $\phi_0$ is small, we can use the small angle approximation

$$\sin\phi \approx \phi \tag{1.52}$$

and, within this approximation, (1.51) becomes

$$\ddot{\phi} = -\frac{g}{R}\,\phi \tag{1.53}$$

which can be solved using elementary functions. [By this stage, you have almost
certainly recognized that our discussion of the skateboard problem closely
parallels the analysis of the simple pendulum. In particular, the small-angle


¹³ I am taking for granted that it doesn’t reach the top and jump out of the trough. Since it was
released from rest inside the trough, this is correct. Much the easiest way to prove this claim is to
invoke conservation of energy, which we shan’t be discussing for a while. Perhaps, for now, you
could agree to accept it as a matter of common sense.

¹⁴ Actually the solution of (1.51) is a Jacobi elliptic function. However, I shall take the point of
view that for most of us the Jacobi function is not “elementary.”
approximation (1.52) is what let you solve the simple pendulum in your introductory
physics course. This parallel is, of course, no accident. Mathematically
the two problems are exactly equivalent.] If we define the parameter

$$\omega = \sqrt{\frac{g}{R}}, \tag{1.54}$$

then (1.53) becomes

$$\ddot{\phi} = -\omega^2\phi. \tag{1.55}$$


This is the equation of motion for our skateboard in the small-angle approximation.
I would like to discuss its solution in some detail to introduce several ideas
that we’ll be using again and again in what follows. (If you’ve studied differential
equations before, just see the next three paragraphs as a quick review.)

We first observe that it is easy to find two solutions of the equation (1.55)
by inspection (that is, by inspired guessing). The function $\phi(t) = A\sin(\omega t)$ is
clearly a solution for any value of the constant $A$. [Differentiating $\sin(\omega t)$ brings
out a factor of $\omega$ and changes the sin to a cos; differentiating it again brings
out another $\omega$ and changes the cos back to $-\sin$. Thus the proposed solution
does satisfy $\ddot{\phi} = -\omega^2\phi$.] Similarly, the function $\phi(t) = B\cos(\omega t)$ is another
solution for any constant $B$. Furthermore, as you can easily check, the sum of
these two solutions is itself a solution. Thus we have now found a whole family
of solutions:

$$\phi(t) = A\sin(\omega t) + B\cos(\omega t) \tag{1.56}$$

is a solution for any values of the two constants $A$ and $B$.

I now want to argue that every solution of the equation of motion (1.55)
has the form (1.56). In other words, (1.56) is the general solution: we have
found all solutions, and we need seek no further. To get some idea of why
this is, note that the differential equation (1.55) is a statement about the second
derivative $\ddot{\phi}$ of the unknown $\phi$. Now, if we had actually been told what $\ddot{\phi}$ is, then
we know from elementary calculus that we could find $\phi$ by two integrations,
and the result would contain two unknown constants (the two constants of
integration) that would have to be determined by looking (for example) at the
initial values of $\phi$ and $\dot{\phi}$. In other words, knowledge of $\ddot{\phi}$ would tell us that
$\phi$ itself is one of a family of functions containing precisely two undetermined
constants. Of course, the differential equation (1.55) does not actually tell us
$\ddot{\phi}$; it is an equation for $\ddot{\phi}$ in terms of $\phi$. Nevertheless, it is plausible that such
an equation would imply that $\phi$ is one of a family of functions that contain
precisely two undetermined constants. If you have studied differential equations,
you know that this is the case; if you have not, then I must ask you to accept it
as a plausible fact: For any given second-order differential equation [in a large
class of “reasonable” equations, including (1.55) and all of the equations we
shall encounter in this book], the solutions all belong to a family of functions
containing precisely two independent constants, like the constants $A$ and $B$ in
(1.56). (More generally, the solutions of an $n$th-order equation contain precisely
$n$ independent constants.)

This theorem sheds a new light on our solution (1.56). We already knew that
any function of the form (1.56) is a solution of the equation of motion. Our 
theorem now guarantees that every solution of the equation of motion is of this 
form. This same argument applies to all the second-order differential equations 
we shall encounter. If, by hook or by crook, we can find a solution like (1.56) 
involving two arbitrary constants, then we are guaranteed that we have found 
the general solution of our equation. 

All that remains is to pin down the two constants $A$ and $B$ for our skateboard.
To do so, we must look at the initial conditions. At $t = 0$, Equation (1.56) implies
that $\phi = B$. Therefore $B$ is just the initial value of $\phi$, which we are calling $\phi_0$,
so $B = \phi_0$. At $t = 0$, Equation (1.56) implies that $\dot{\phi} = \omega A$. Since I released the
board from rest, this means that $A = 0$, and our solution is

$$\phi(t) = \phi_0\cos(\omega t). \tag{1.57}$$

The first thing to note about this solution is that, as we anticipated on general
grounds, $\phi(t)$ oscillates, moving from positive to negative and back to positive
periodically and indefinitely. In particular, the board first returns to its initial
position $\phi_0$ when $\omega t = 2\pi$. The time that this takes is called the period of
the motion and is denoted $\tau$. Thus our conclusion is that the period of the
skateboard’s oscillations is

$$\tau = \frac{2\pi}{\omega} = 2\pi\sqrt{\frac{R}{g}}. \tag{1.58}$$

We were given that $R = 5$ m, and $g = 9.8$ m/s$^2$. Substituting these numbers,
we conclude that the skateboard returns to its starting point in a time $\tau = 4.5$
seconds.
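
To see how good the small-angle approximation is, you can solve the full equation of motion (1.51) on a computer, as suggested earlier in this example. The Python sketch below is one possible way to do so, not a prescribed method: it evaluates the small-angle period and then integrates (1.51) with a fourth-order Runge-Kutta step (my choice of integrator, with an arbitrarily chosen step size) from release until the board first reaches the bottom, which by symmetry takes a quarter of a period. The function names are hypothetical.

```python
import math

G = 9.8  # m/s^2
R = 5.0  # m, radius of the half-pipe in Example 1.2


def small_angle_period(radius=R, g=G):
    """Period in the small-angle approximation: tau = 2 pi sqrt(R / g)."""
    return 2.0 * math.pi * math.sqrt(radius / g)


def numerical_period(phi0, radius=R, g=G, dt=1e-4):
    """Period of phi'' = -(g/R) sin(phi), released from rest at phi0 > 0.

    Integrates with fourth-order Runge-Kutta steps until phi first crosses
    zero (a quarter of a period), then multiplies the elapsed time by four.
    """
    def deriv(phi, omega):
        return omega, -(g / radius) * math.sin(phi)

    t, phi, omega = 0.0, phi0, 0.0
    while True:
        k1 = deriv(phi, omega)
        k2 = deriv(phi + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
        k3 = deriv(phi + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
        k4 = deriv(phi + dt * k3[0], omega + dt * k3[1])
        new_phi = phi + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        new_omega = omega + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        if new_phi <= 0.0:
            # Linear interpolation for the zero crossing inside this step.
            frac = phi / (phi - new_phi)
            return 4.0 * (t + frac * dt)
        t, phi, omega = t + dt, new_phi, new_omega


if __name__ == "__main__":
    print(small_angle_period())    # small-angle result
    print(numerical_period(0.05))  # release a short way from the bottom
    print(numerical_period(1.0))   # release about 57 degrees up the wall
```

For a small release angle the numerical period should be very close to the small-angle value, while for a large release angle it comes out noticeably longer, which shows where the approximation $\sin\phi \approx \phi$ starts to fail.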



Principal Definitions and Equations of Chapter 1 

Dot and Cross Products 


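$$\mathbf{r}\cdot\mathbf{s} = rs\cos\theta = r_x s_x + r_y s_y + r_z s_z \qquad \text{[Eqs. (1.6) \& (1.7)]}$$

$$\mathbf{r}\times\mathbf{s} = (r_y s_z - r_z s_y,\; r_z s_x - r_x s_z,\; r_x s_y - r_y s_x) = \det\begin{bmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ r_x & r_y & r_z \\ s_x & s_y & s_z \end{bmatrix} \qquad \text{[Eq. (1.9)]}$$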
Inertial Frames

An inertial frame is any reference frame in which Newton’s first law holds, that is, a 
nonaccelerating, nonrotating frame. 


Unit Vectors of a Coordinate System 

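If $(\xi, \eta, \zeta)$ are an orthogonal system of coordinates, then

$\hat{\boldsymbol{\xi}}$ = unit vector in direction of increasing $\xi$ with $\eta$ and $\zeta$ fixed

and so on, and any vector $\mathbf{s}$ can be expanded as $\mathbf{s} = s_\xi\,\hat{\boldsymbol{\xi}} + s_\eta\,\hat{\boldsymbol{\eta}} + s_\zeta\,\hat{\boldsymbol{\zeta}}$.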


Newton’s Second Law in Various Coordinate Systems 


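Vector Form: $\mathbf{F} = m\ddot{\mathbf{r}}$

Cartesian $(x, y, z)$, Eq. (1.35):
    $F_x = m\ddot{x}$
    $F_y = m\ddot{y}$
    $F_z = m\ddot{z}$

2D Polar $(r, \phi)$, Eq. (1.48):
    $F_r = m(\ddot{r} - r\dot{\phi}^2)$
    $F_\phi = m(r\ddot{\phi} + 2\dot{r}\dot{\phi})$

Cylindrical Polar $(\rho, \phi, z)$, Problem 1.47 or 1.48:
    $F_\rho = m(\ddot{\rho} - \rho\dot{\phi}^2)$
    $F_\phi = m(\rho\ddot{\phi} + 2\dot{\rho}\dot{\phi})$
    $F_z = m\ddot{z}$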


Problems for Chapter 1

The problems for each chapter are arranged according to section number. A problem listed for a given
section requires an understanding of that section and earlier sections, but not of later sections. Within each
section problems are listed in approximate order of difficulty. A single star (★) indicates straightforward
problems involving just one main concept. Two stars (★★) identify problems that are slightly more challenging
and usually involve more than one concept. Three stars (★★★) indicate problems that are distinctly more
challenging, either because they are intrinsically difficult or involve lengthy calculations. Needless to say,
these distinctions are hard to draw and are only approximate.

Problems that need the use of a computer are flagged thus: [Computer]. These are mostly classified as
★★★ on the grounds that it usually takes a long time to set up the necessary code, especially if you’re just
learning the language.

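SECTION 1.2 Space and Time

1.1 ★ Given the two vectors $\mathbf{b} = \hat{\mathbf{x}} + \hat{\mathbf{y}}$ and $\mathbf{c} = \hat{\mathbf{x}} + \hat{\mathbf{z}}$ find $\mathbf{b} + \mathbf{c}$, $5\mathbf{b} + 2\mathbf{c}$, $\mathbf{b}\cdot\mathbf{c}$, and $\mathbf{b}\times\mathbf{c}$.

1.2 ★ Two vectors are given as $\mathbf{b} = (1, 2, 3)$ and $\mathbf{c} = (3, 2, 1)$. (Remember that these statements are just
a compact way of giving you the components of the vectors.) Find $\mathbf{b} + \mathbf{c}$, $5\mathbf{b} - 2\mathbf{c}$, $\mathbf{b}\cdot\mathbf{c}$, and $\mathbf{b}\times\mathbf{c}$.

1.3 ★ By applying Pythagoras’s theorem (the usual two-dimensional version) twice over, prove that
the length $r$ of a three-dimensional vector $\mathbf{r} = (x, y, z)$ satisfies $r^2 = x^2 + y^2 + z^2$.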
1.4 ★ One of the many uses of the scalar product is to find the angle between two given vectors. Find
the angle between the vectors $\mathbf{b} = (1, 2, 4)$ and $\mathbf{c} = (4, 2, 1)$ by evaluating their scalar product.

1.5 ★ Find the angle between a body diagonal of a cube and any one of its face diagonals. [Hint: Choose
a cube with side 1 and with one corner at $O$ and the opposite corner at the point $(1, 1, 1)$. Write down
the vector that represents a body diagonal and another that represents a face diagonal, and then find the
angle between them as in Problem 1.4.]

1.6 ★ By evaluating their dot product, find the values of the scalar $s$ for which the two vectors
$\mathbf{b} = \hat{\mathbf{x}} + s\,\hat{\mathbf{y}}$ and $\mathbf{c} = \hat{\mathbf{x}} - s\,\hat{\mathbf{y}}$ are orthogonal. (Remember that two vectors are orthogonal if and only
if their dot product is zero.) Explain your answers with a sketch.

1.7 ★ Prove that the two definitions of the scalar product $\mathbf{r}\cdot\mathbf{s}$ as $rs\cos\theta$ (1.6) and $\sum r_i s_i$ (1.7) are equal.
One way to do this is to choose your $x$ axis along the direction of $\mathbf{r}$. [Strictly speaking you should first
make sure that the definition (1.7) is independent of the choice of axes. If you like to worry about such
niceties, see Problem 1.16.]

1.8 ★ (a) Use the definition (1.7) to prove that the scalar product is distributive, that is, $\mathbf{r}\cdot(\mathbf{u} + \mathbf{v}) =
\mathbf{r}\cdot\mathbf{u} + \mathbf{r}\cdot\mathbf{v}$. (b) If $\mathbf{r}$ and $\mathbf{s}$ are vectors that depend on time, prove that the product rule for differentiating
products applies to $\mathbf{r}\cdot\mathbf{s}$, that is, that

$$\frac{d}{dt}(\mathbf{r}\cdot\mathbf{s}) = \mathbf{r}\cdot\frac{d\mathbf{s}}{dt} + \frac{d\mathbf{r}}{dt}\cdot\mathbf{s}.$$

1.9 ★ In elementary trigonometry, you probably learned the law of cosines for a triangle of sides $a$, $b$,
and $c$, that $c^2 = a^2 + b^2 - 2ab\cos\theta$, where $\theta$ is the angle between the sides $a$ and $b$. Show that the
law of cosines is an immediate consequence of the identity $(\mathbf{a} + \mathbf{b})^2 = a^2 + b^2 + 2\mathbf{a}\cdot\mathbf{b}$.

1.10 ★ A particle moves in a circle (center $O$ and radius $R$) with constant angular velocity $\omega$ counterclockwise.
The circle lies in the $xy$ plane and the particle is on the $x$ axis at time $t = 0$. Show that the
particle’s position is given by

$$\mathbf{r}(t) = \hat{\mathbf{x}}\,R\cos(\omega t) + \hat{\mathbf{y}}\,R\sin(\omega t).$$

Find the particle’s velocity and acceleration. What are the magnitude and direction of the acceleration?
Relate your results to well-known properties of uniform circular motion.

1.11 ★ The position of a moving particle is given as a function of time $t$ to be

$$\mathbf{r}(t) = \hat{\mathbf{x}}\,b\cos(\omega t) + \hat{\mathbf{y}}\,c\sin(\omega t),$$

where $b$, $c$, and $\omega$ are constants. Describe the particle’s orbit.

1.12 ★ The position of a moving particle is given as a function of time $t$ to be

$$\mathbf{r}(t) = \hat{\mathbf{x}}\,b\cos(\omega t) + \hat{\mathbf{y}}\,c\sin(\omega t) + \hat{\mathbf{z}}\,v_0 t$$

where $b$, $c$, $v_0$ and $\omega$ are constants. Describe the particle’s orbit.

1.13 ★ Let $\mathbf{u}$ be an arbitrary fixed unit vector and show that any vector $\mathbf{b}$ satisfies

$$b^2 = (\mathbf{u}\cdot\mathbf{b})^2 + (\mathbf{u}\times\mathbf{b})^2.$$

Explain this result in words, with the help of a picture.
1.14 ★ Prove that for any two vectors $\mathbf{a}$ and $\mathbf{b}$,

$$|\mathbf{a} + \mathbf{b}| \le (a + b).$$

[Hint: Work out $|\mathbf{a} + \mathbf{b}|^2$ and compare it with $(a + b)^2$.] Explain why this is called the triangle
inequality.

1.15 ★ Show that the definition (1.9) of the cross product is equivalent to the elementary definition that
$\mathbf{r}\times\mathbf{s}$ is perpendicular to both $\mathbf{r}$ and $\mathbf{s}$, with magnitude $rs\sin\theta$ and direction given by the right-hand
rule. [Hint: It is a fact (though quite hard to prove) that the definition (1.9) is independent of your choice
of axes. Therefore you can choose axes so that $\mathbf{r}$ points along the $x$ axis and $\mathbf{s}$ lies in the $xy$ plane.]

1.16 ★★ (a) Defining the scalar product $\mathbf{r}\cdot\mathbf{s}$ by Equation (1.7), $\mathbf{r}\cdot\mathbf{s} = \sum r_i s_i$, show that Pythagoras’s
theorem implies that the magnitude of any vector $\mathbf{r}$ is $r = \sqrt{\mathbf{r}\cdot\mathbf{r}}$. (b) It is clear that the length of a
vector does not depend on our choice of coordinate axes. Thus the result of part (a) guarantees that the
scalar product $\mathbf{r}\cdot\mathbf{r}$, as defined by (1.7), is the same for any choice of orthogonal axes. Use this to prove
that $\mathbf{r}\cdot\mathbf{s}$, as defined by (1.7), is the same for any choice of orthogonal axes. [Hint: Consider the length
of the vector $\mathbf{r} + \mathbf{s}$.]

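1.17 ★★ (a) Prove that the vector product $\mathbf{r}\times\mathbf{s}$ as defined by (1.9) is distributive; that is, that $\mathbf{r}\times(\mathbf{u} +
\mathbf{v}) = (\mathbf{r}\times\mathbf{u}) + (\mathbf{r}\times\mathbf{v})$. (b) Prove the product rule

$$\frac{d}{dt}(\mathbf{r}\times\mathbf{s}) = \mathbf{r}\times\frac{d\mathbf{s}}{dt} + \frac{d\mathbf{r}}{dt}\times\mathbf{s}.$$

Be careful with the order of the factors.

1.18 ★★ The three vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ are the three sides of the triangle $ABC$ with angles $\alpha$, $\beta$, $\gamma$ as shown
in Figure 1.15. (a) Prove that the area of the triangle is given by any one of these three expressions:

$$\text{area} = \tfrac{1}{2}|\mathbf{a}\times\mathbf{b}| = \tfrac{1}{2}|\mathbf{b}\times\mathbf{c}| = \tfrac{1}{2}|\mathbf{c}\times\mathbf{a}|.$$

(b) Use the equality of these three expressions to prove the so-called law of sines, that

$$\frac{a}{\sin\alpha} = \frac{b}{\sin\beta} = \frac{c}{\sin\gamma}.$$

Figure 1.15 Triangle for Problem 1.18.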
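1.19 ★★ If $\mathbf{r}$, $\mathbf{v}$, $\mathbf{a}$ denote the position, velocity, and acceleration of a particle, prove that

$$\frac{d}{dt}\left[\mathbf{a}\cdot(\mathbf{v}\times\mathbf{r})\right] = \dot{\mathbf{a}}\cdot(\mathbf{v}\times\mathbf{r}).$$

1.20 ★★ The three vectors $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ point from the origin $O$ to the three corners of a triangle. Use the
result of Problem 1.18 to show that the area of the triangle is given by

$$\text{(area of triangle)} = \tfrac{1}{2}\left|(\mathbf{B}\times\mathbf{C}) + (\mathbf{C}\times\mathbf{A}) + (\mathbf{A}\times\mathbf{B})\right|.$$

1.21 ★★ A parallelepiped (a six-faced solid with opposite faces parallel) has one corner at the origin
$O$ and the three edges that emanate from $O$ defined by vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$. Show that the volume of the
parallelepiped is $|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|$.

1.22 ★★ The two vectors $\mathbf{a}$ and $\mathbf{b}$ lie in the $xy$ plane and make angles $\alpha$ and $\beta$ with the $x$ axis. (a) By
evaluating $\mathbf{a}\cdot\mathbf{b}$ in two ways [namely using (1.6) and (1.7)] prove the well-known trig identity

$$\cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta.$$

(b) By similarly evaluating $\mathbf{a}\times\mathbf{b}$ prove that

$$\sin(\alpha - \beta) = \sin\alpha\cos\beta - \cos\alpha\sin\beta.$$

1.23 ★★ The unknown vector $\mathbf{v}$ satisfies $\mathbf{b}\cdot\mathbf{v} = \lambda$ and $\mathbf{b}\times\mathbf{v} = \mathbf{c}$, where $\lambda$, $\mathbf{b}$, and $\mathbf{c}$ are fixed and known.
Find $\mathbf{v}$ in terms of $\lambda$, $\mathbf{b}$, and $\mathbf{c}$.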

SECTION 1.4 Newton’s First and Second Laws; Inertial Frames

1.24 ★ In case you haven’t studied any differential equations before, I shall be introducing the necessary
ideas as needed. Here is a simple exercise to get you started: Find the general solution of the first-order
equation $df/dt = f$ for an unknown function $f(t)$. [There are several ways to do this. One is
to rewrite the equation as $df/f = dt$ and then integrate both sides.] How many arbitrary constants
does the general solution contain? [Your answer should illustrate the important general theorem that
the solution to any $n$th-order differential equation (in a very large class of “reasonable” equations)
contains $n$ arbitrary constants.]

1.25 ★ Answer the same questions as in Problem 1.24, but for the differential equation $df/dt = -3f$.

1.26 ** The hallmark of an inertial reference frame is that any object which is subject to zero net force 
will travel in a straight line at constant speed. To illustrate this, consider the following: I am standing 
on a level floor at the origin of an inertial frame $\mathcal{S}$ and kick a frictionless puck due north across the 
floor. (a) Write down the x and y coordinates of the puck as functions of time as seen from my inertial 
frame. (Use x and y axes pointing east and north respectively.) Now consider two more observers, the 
first at rest in a frame $\mathcal{S}'$ that travels with constant velocity v due east relative to $\mathcal{S}$, the second at rest 
in a frame $\mathcal{S}''$ that travels with constant acceleration due east relative to $\mathcal{S}$. (All three frames coincide 
at the moment when I kick the puck, and $\mathcal{S}''$ is at rest relative to $\mathcal{S}$ at that same moment.) (b) Find the 
coordinates x', y' of the puck and describe the puck’s path as seen from $\mathcal{S}'$. (c) Do the same for $\mathcal{S}''$. 
Which of the frames is inertial? 

1.27 ★* The hallmark of an inertial reference frame is that any object which is subject to zero net force 
will travel in a straight line at constant speed. To illustrate this, consider the following experiment: I am 
standing on the ground (which we shall take to be an inertial frame) beside a perfectly flat horizontal 
turntable, rotating with constant angular velocity $\omega$. I lean over and shove a frictionless puck so that 
it slides across the turntable, straight through the center. The puck is subject to zero net force and, as 
seen from my inertial frame, travels in a straight line. Describe the puck’s path as observed by someone 
sitting at rest on the turntable. This requires careful thought, but you should be able to get a qualitative 
picture. For a quantitative picture, it helps to use polar coordinates; see Problem 1.46. 

section 1.5 The Third Law and Conservation of Momentum 

1.28* Go over the steps from Equation (1.25) to (1.29) in the proof of conservation of momentum, 
but treat the case that N = 3 and write out all the summations explicitly to be sure you understand the 
various manipulations. 

1.29 * Do the same tasks as in Problem 1.28 but for the case of four particles (N = 4). 

1.30* Conservation laws, such as conservation of momentum, often give a surprising amount of 
information about the possible outcome of an experiment. Here is perhaps the simplest example: Two 
objects of masses $m_1$ and $m_2$ are subject to no external forces. Object 1 is traveling with velocity v 
when it collides with the stationary object 2. The two objects stick together and move off with common 
velocity v'. Use conservation of momentum to find v' in terms of v, $m_1$, and $m_2$. 

1.31* In Section 1.5 we proved that Newton’s third law implies the conservation of momentum. 
Prove the converse, that if the law of conservation of momentum applies to every possible group of 
particles, then the interparticle forces must obey the third law. [Hint: However many particles your 
system contains, you can focus your attention on just two of them. (Call them 1 and 2.) The law of 
conservation of momentum says that if there are no external forces on this pair of particles, then their 
total momentum must be constant. Use this to prove that $\mathbf{F}_{12} = -\mathbf{F}_{21}$.] 

1.32** If you have some experience in electromagnetism, you could do the following problem 
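concerning the curious situation illustrated in Figure 1.8. The electric and magnetic fields at a point $\mathbf{r}_1$ 
due to a charge $q_2$ at $\mathbf{r}_2$ moving with constant velocity $\mathbf{v}_2$ (with $v_2 \ll c$) are¹⁵ 

$\mathbf{E}(\mathbf{r}_1) = \frac{1}{4\pi\epsilon_0}\,\frac{q_2}{s^2}\,\hat{\mathbf{s}}$ and $\mathbf{B}(\mathbf{r}_1) = \frac{\mu_0}{4\pi}\,\frac{q_2}{s^2}\,\mathbf{v}_2 \times \hat{\mathbf{s}}$ 

where $\mathbf{s} = \mathbf{r}_1 - \mathbf{r}_2$ is the vector pointing from $\mathbf{r}_2$ to $\mathbf{r}_1$. (The first of these you should recognize as 
Coulomb’s law.) If $\mathbf{F}_{12}^{\text{el}}$ and $\mathbf{F}_{12}^{\text{mag}}$ denote the electric and magnetic forces on a charge $q_1$ at $\mathbf{r}_1$ with velocity 
$\mathbf{v}_1$, show that $F_{12}^{\text{mag}} \le (v_1 v_2/c^2)\,F_{12}^{\text{el}}$. This shows that in the non-relativistic domain it is legitimate to 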
ignore the magnetic force between two moving charges. 

1.33 *** If you have some experience in electromagnetism and with vector calculus, prove that the 
magnetic forces, $\mathbf{F}_{12}$ and $\mathbf{F}_{21}$, between two steady current loops obey Newton’s third law. [Hints: Let 
the two currents be $I_1$ and $I_2$ and let typical points on the two loops be $\mathbf{r}_1$ and $\mathbf{r}_2$. If $d\mathbf{r}_1$ and $d\mathbf{r}_2$ are 
short segments of the loops, then according to the Biot-Savart law, the force on $d\mathbf{r}_1$ due to $d\mathbf{r}_2$ is 

$\frac{\mu_0}{4\pi}\,\frac{I_1 I_2}{s^2}\, d\mathbf{r}_1 \times (d\mathbf{r}_2 \times \hat{\mathbf{s}})$ 

where $\mathbf{s} = \mathbf{r}_1 - \mathbf{r}_2$. The force $\mathbf{F}_{12}$ is found by integrating this around both loops. You will need to use 
the “BAC − CAB” rule to simplify the triple product.] 


15 See, for example, David J. Griffiths, Introduction to Electrodynamics, 3rd ed., Prentice Hall (1999), p. 440. 

1.34 *** Prove that in the absence of external forces, the total angular momentum (defined as 
$\mathbf{L} = \sum_\alpha \mathbf{r}_\alpha \times \mathbf{p}_\alpha$) of an N-particle system is conserved. [Hints: You need to mimic the argument from 
(1.25) to (1.29). In this case you need more than Newton’s third law: In addition you need to assume 
that the interparticle forces are central; that is, $\mathbf{F}_{\alpha\beta}$ acts along the line joining particles $\alpha$ and $\beta$. A full 
discussion of angular momentum is given in Chapter 3.] 


section 1.6 Newton’s Second Law in Cartesian Coordinates 

1.35 * A golf ball is hit from ground level with speed $v_0$ in a direction that is due east and at an angle 
$\theta$ above the horizontal. Neglecting air resistance, use Newton’s second law (1.35) to find the position 
as a function of time, using coordinates with x measured east, y north, and z vertically up. Find the 
time for the golf ball to return to the ground and how far it travels in that time. 

1.36 * A plane, which is flying horizontally at a constant speed $v_0$ and at a height h above the sea, 
must drop a bundle of supplies to a castaway on a small raft. (a) Write down Newton’s second law 
for the bundle as it falls from the plane, assuming you can neglect air resistance. Solve your equations 
to give the bundle’s position in flight as a function of time t. (b) How far before the raft (measured 
horizontally) must the pilot drop the bundle if it is to hit the raft? What is this distance if $v_0 = 50$ m/s, 
h = 100 m, and $g \approx 10$ m/s²? (c) Within what interval of time ($\pm\Delta t$) must the pilot drop the bundle if 
it is to land within ±10 m of the raft? 

1.37 * A student kicks a frictionless puck with initial speed $v_0$, so that it slides straight up a plane that 
is inclined at an angle $\theta$ above the horizontal. (a) Write down Newton’s second law for the puck and 
solve to give its position as a function of time, (b) How long will the puck take to return to its starting 
point? 

1.38 ★ You lay a rectangular board on the horizontal floor and then tilt the board about one edge until 
it slopes at angle $\theta$ with the horizontal. Choose your origin at one of the two corners that touch the 
floor, the x axis pointing along the bottom edge of the board, the y axis pointing up the slope, and 
the z axis normal to the board. You now kick a frictionless puck that is resting at O so that it slides 
across the board with initial velocity $(v_{0x}, v_{0y}, 0)$. Write down Newton’s second law using the given 
coordinates and then find how long the puck takes to return to the floor level and how far it is from O 
when it does so. 

1.39 ** A ball is thrown with initial speed $v_0$ up an inclined plane. The plane is inclined at an angle 
$\phi$ above the horizontal, and the ball’s initial velocity is at an angle $\theta$ above the plane. Choose axes 
with x measured up the slope, y normal to the slope, and z across it. Write down Newton’s second law 
using these axes and find the ball’s position as a function of time. Show that the ball lands a distance 
$R = 2v_0^2 \sin\theta\cos(\theta + \phi)/(g\cos^2\phi)$ from its launch point. Show that for given $v_0$ and $\phi$, the maximum 
possible range up the inclined plane is $R_{\max} = v_0^2/[g(1 + \sin\phi)]$. 

1.40 *** A cannon shoots a ball at an angle $\theta$ above the horizontal ground. (a) Neglecting air resistance, 
use Newton’s second law to find the ball’s position as a function of time. (Use axes with x measured 
horizontally and y vertically.) (b) Let r(t) denote the ball’s distance from the cannon. What is the 
largest possible value of $\theta$ if r(t) is to increase throughout the ball’s flight? [Hint: Using your solution 
to part (a) you can write down $r^2$ as $x^2 + y^2$, and then find the condition that $r^2$ is always increasing.] 

section 1.7 Two-Dimensional Polar Coordinates 

1.41 * An astronaut in gravity-free space is twirling a mass m on the end of a string of length R in a 
circle, with constant angular velocity $\omega$. Write down Newton’s second law (1.48) in polar coordinates 
and find the tension in the string. 

1.42 * Prove that the transformations from rectangular to polar coordinates and vice versa are given 
by the four equations (1.37). Explain why the equation for $\phi$ is not quite complete and give a complete 
version. 

1.43 * (a) Prove that the unit vector $\hat{\mathbf{r}}$ of two-dimensional polar coordinates is equal to 

$\hat{\mathbf{r}} = \hat{\mathbf{x}}\cos\phi + \hat{\mathbf{y}}\sin\phi$    (1.59) 

and find a corresponding expression for $\hat{\boldsymbol{\phi}}$. (b) Assuming that $\phi$ depends on the time t, differentiate your 
answers in part (a) to give an alternative proof of the results (1.42) and (1.46) for the time derivatives 
$d\hat{\mathbf{r}}/dt$ and $d\hat{\boldsymbol{\phi}}/dt$. 

1.44 * Verify by direct substitution that the function $\phi(t) = A\sin(\omega t) + B\cos(\omega t)$ of (1.56) is a 
solution of the second-order differential equation (1.55), $\ddot{\phi} = -\omega^2\phi$. (Since this solution involves 
two arbitrary constants — the coefficients of the sine and cosine functions — it is in fact the general 
solution.) 

1.45 ★★ Prove that if v(t) is any vector that depends on time (for example the velocity of a moving 
particle) but which has constant magnitude, then $\dot{\mathbf{v}}(t)$ is orthogonal to $\mathbf{v}(t)$. Prove the converse that if 
$\dot{\mathbf{v}}(t)$ is orthogonal to $\mathbf{v}(t)$, then $|\mathbf{v}(t)|$ is constant. [Hint: Consider the derivative of $\mathbf{v}^2$.] This is a very 
handy result. It explains why, in two-dimensional polars, $d\hat{\mathbf{r}}/dt$ has to be in the direction of $\hat{\boldsymbol{\phi}}$ and 
vice versa. It also shows that the speed of a charged particle in a magnetic field is constant, since the 
acceleration is perpendicular to the velocity. 

1.46 ** Consider the experiment of Problem 1.27, in which a frictionless puck is slid straight across 
a rotating turntable through the center O. (a) Write down the polar coordinates $r, \phi$ of the puck as 
functions of time, as measured in the inertial frame $\mathcal{S}$ of an observer on the ground. (Assume that the 
puck was launched along the axis $\phi = 0$ at t = 0.) (b) Now write down the polar coordinates $r', \phi'$ of 
the puck as measured by an observer (frame $\mathcal{S}'$) at rest on the turntable. (Choose these coordinates so 
that $\phi$ and $\phi'$ coincide at t = 0.) Describe and sketch the path seen by this second observer. Is the frame 
$\mathcal{S}'$ inertial? 

1.47 ** Let the position of a point P in three dimensions be given by the vector r = (x, y, z) in 
rectangular (or Cartesian) coordinates. The same position can be specified by cylindrical polar 
coordinates, $\rho, \phi, z$, which are defined as follows: Let P' denote the projection of P onto the xy 
plane; that is, P' has Cartesian coordinates (x, y, 0). Then $\rho$ and $\phi$ are defined as the two-dimensional 
polar coordinates of P' in the xy plane, while z is the third Cartesian coordinate, unchanged. (a) Make 
a sketch to illustrate the three cylindrical coordinates. Give expressions for $\rho, \phi, z$ in terms of the 
Cartesian coordinates x, y, z. Explain in words what $\rho$ is (“$\rho$ is the distance of P from ____”). There 
are many variants in notation. For instance, some people use r instead of $\rho$. Explain why this use of r is 
unfortunate. (b) Describe the three unit vectors $\hat{\boldsymbol{\rho}}, \hat{\boldsymbol{\phi}}, \hat{\mathbf{z}}$ and write the expansion of the position vector r 
in terms of these unit vectors. (c) Differentiate your last answer twice to find the cylindrical components 
of the acceleration $\mathbf{a} = \ddot{\mathbf{r}}$ of the particle. To do this, you will need to know the time derivatives of $\hat{\boldsymbol{\rho}}$ 
and $\hat{\boldsymbol{\phi}}$. You could get these from the corresponding two-dimensional results (1.42) and (1.46), or you 
could derive them directly as in Problem 1.48. 

1.48 ** Find expressions for the unit vectors $\hat{\boldsymbol{\rho}}$, $\hat{\boldsymbol{\phi}}$, and $\hat{\mathbf{z}}$ of cylindrical polar coordinates (Problem 
1.47) in terms of the Cartesian $\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}$. Differentiate these expressions with respect to time to find 
$d\hat{\boldsymbol{\rho}}/dt$, $d\hat{\boldsymbol{\phi}}/dt$, and $d\hat{\mathbf{z}}/dt$. 

1.49 ** Imagine two concentric cylinders, centered on the vertical z axis, with radii $R \pm \epsilon$, where $\epsilon$ is 
very small. A small frictionless puck of thickness $2\epsilon$ is inserted between the two cylinders, so that it 
can be considered a point mass that can move freely at a fixed distance from the vertical axis. If we use 
cylindrical polar coordinates $(\rho, \phi, z)$ for its position (Problem 1.47), then $\rho$ is fixed at $\rho = R$, while $\phi$ 
and z can vary at will. Write down and solve Newton’s second law for the general motion of the puck, 
including the effects of gravity. Describe the puck’s motion. 

1.50 [Computer] The differential equation (1.51) for the skateboard of Example 1.2 cannot be 
solved in terms of elementary functions, but is easily solved numerically. (a) If you have access to 
software, such as Mathematica, Maple, or Matlab, that can solve differential equations numerically, 
solve the differential equation for the case that the board is released from $\phi_0 = 20$ degrees, using the 
values R = 5 m and g = 9.8 m/s². Make a plot of $\phi$ against time for two or three periods. (b) On the 
same picture, plot the approximate solution (1.57) with the same $\phi_0 = 20°$. Comment on your two 
graphs. Note: If you haven’t used the numerical solver before, you will need to learn the necessary 
syntax. For example, in Mathematica you will need to learn the syntax for “NDSolve” and how to plot 
the solution that it provides. This takes a bit of time, but is something that is very well worth learning. 

1.51 *** [Computer] Repeat all of Problem 1.50 but using the initial value $\phi_0 = \pi/2$. 




CHAPTER 2 

Projectiles and Charged Particles 


In this chapter, I present two topics: the motion of projectiles subject to the forces of 
gravity and air resistance, and the motion of charged particles in uniform magnetic 
fields. Both problems lend themselves to solution using Newton’s laws in Cartesian 
coordinates, and both allow us to review and introduce some important mathematics. 
Above all, both are problems of great practical interest. 


2.1 Air Resistance 


Most introductory physics courses spend some time studying the motion of projectiles, 
but they almost always ignore air resistance. In many problems this is an excellent 
approximation; in others, air resistance is obviously important, and we need to know 
how to account for it. More generally, whether or not air resistance is significant, we 
need some way to estimate how important it really is. 

Let us begin by surveying some of the basic properties of the resistive force, or 
drag, f of the air, or other medium, through which an object is moving. (I shall 
generally speak of “air resistance” since air is the medium through which most 
projectiles move, but the same considerations apply to other gases and often to liquids 
as well.) The most obvious fact about air resistance, well known to anyone who rides 
a bicycle, is that it depends on the speed, v, of the object concerned. In addition, for 
many objects, the direction of the force due to motion through the air is opposite to 
the velocity v. For certain objects, such as a nonrotating sphere, this is exactly true, 
and for many it is a good approximation. You should, however, be aware that there 
are situations where it is certainly not true: The force of the air on an airplane wing 
has a large sideways component, called the lift, without which no airplanes could fly. 
Nevertheless, I shall assume that f and v point in opposite directions; that is, I shall 
consider only objects for which the sideways force is zero, or at least small enough 
to be neglected. The situation is illustrated in Figure 2.1 and is summed up in the 
equation 

$\mathbf{f} = -f(v)\,\hat{\mathbf{v}}$,    (2.1) 

where $\hat{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$ denotes the unit vector in the direction of v, and f(v) is the magnitude 
of f. 

Figure 2.1 A projectile is subject to two forces, the force of gravity, $\mathbf{w} = m\mathbf{g}$, and the drag force of air resistance, $\mathbf{f} = -f(v)\,\hat{\mathbf{v}}$. 

The function f(v) that gives the magnitude of the air resistance varies with v in 
a complicated way, especially as the object’s speed approaches the speed of sound. 
However, at lower speeds it is often a good approximation to write¹ 

$f(v) = bv + cv^2 = f_{\text{lin}} + f_{\text{quad}}$    (2.2) 

where $f_{\text{lin}}$ and $f_{\text{quad}}$ stand for the linear and quadratic terms respectively, 

$f_{\text{lin}} = bv$ and $f_{\text{quad}} = cv^2$.    (2.3) 

The physical origins of these two terms are quite different: The linear term, $f_{\text{lin}}$, arises 
from the viscous drag of the medium and is generally proportional to the viscosity of 
the medium and the linear size of the projectile (Problem 2.2). The quadratic term, 
$f_{\text{quad}}$, arises from the projectile’s having to accelerate the mass of air with which it is 
continually colliding; $f_{\text{quad}}$ is proportional to the density of the medium and the 
cross-sectional area of the projectile (Problem 2.4). In particular, for a spherical projectile 
(a cannonball, a baseball, or a drop of rain), the coefficients b and c in (2.2) have the 
form 


$b = \beta D$ and $c = \gamma D^2$    (2.4) 

where D denotes the diameter of the sphere and the coefficients $\beta$ and $\gamma$ depend 
on the nature of the medium. For a spherical projectile in air at STP, they have the 
approximate values 

$\beta = 1.6 \times 10^{-4}\ \mathrm{N\cdot s/m^2}$    (2.5) 

and 

$\gamma = 0.25\ \mathrm{N\cdot s^2/m^4}$.    (2.6) 

(For calculation of these two constants, see Problems 2.2 and 2.4.) You need to 
remember that these values are valid only for a sphere moving through air at STP. 
Nevertheless, they give at least a rough idea of the importance of the drag force even 
for nonspherical bodies moving through different gases at any normal temperatures 
and pressures. 

1 Mathematically, Equation (2.2) is, in a sense, obvious. Any reasonable function is expected to 
have a Taylor series expansion, $f = a + bv + cv^2 + \cdots$. For low enough v, the first three terms 
should give a good approximation, and, since f = 0 when v = 0, the constant term, a, has to be zero. 

It often happens that we can neglect one of the terms in (2.2) compared to the other, 
and this simplifies the task of solving Newton’s second law. To decide whether this 
does happen in a given problem, and which term to neglect, we need to compare the 
sizes of the two terms: 


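$\frac{f_{\text{quad}}}{f_{\text{lin}}} = \frac{cv^2}{bv} = \frac{\gamma}{\beta}\,Dv = \left(1.6 \times 10^{3}\ \frac{\text{s}}{\text{m}^2}\right) Dv$    (2.7) 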


if we use the values (2.5) and (2.6) for a sphere in air. In a given problem, we have 
only to substitute the values of D and v into this equation to find out if one of the 
terms can be neglected, as the following example illustrates. 


example 2.1 A Baseball and Some Drops of Liquid 

Assess the relative importance of the linear and quadratic drags on a baseball 
of diameter D = 7 cm, traveling at a modest v = 5 m/s. Do the same for a drop 
of rain (D = 1 mm and v = 0.6 m/s) and for a tiny droplet of oil used in the 
Millikan oildrop experiment ($D = 1.5\ \mu\text{m}$ and $v = 5 \times 10^{-5}$ m/s). 

When we substitute the numbers for the baseball into (2.7) (remembering to 
convert the diameter to meters), we get 

$\frac{f_{\text{quad}}}{f_{\text{lin}}} \approx 600$    [baseball].    (2.8) 


For this baseball, the linear term is clearly negligible and we need consider only 
the quadratic drag. If the ball is traveling faster, the ratio $f_{\text{quad}}/f_{\text{lin}}$ is even greater. 
At slower speeds the ratio is less dramatic, but even at 1 m/s the ratio is 100. In 
fact if v is small enough that the linear term is comparable to the quadratic, both 
terms are so small as to be negligible. Thus, for baseballs and similar objects, it 
is almost always safe to neglect $f_{\text{lin}}$ and take the drag force to be 

$\mathbf{f} = -cv^2\,\hat{\mathbf{v}}$.    (2.9) 


For the raindrop, the numbers give 

$\frac{f_{\text{quad}}}{f_{\text{lin}}} \approx 1$    [raindrop].    (2.10) 

Thus for this raindrop the two terms are comparable and neither can be neglected, 
which makes solving for the motion more difficult. If the drop were 
a lot larger or were traveling much faster, then the linear term would be negligible; 
and if the drop were much smaller or were traveling much slower, then the 
quadratic term would be negligible. But in general, with raindrops and similar 
objects, we are going to have to take both $f_{\text{lin}}$ and $f_{\text{quad}}$ into account. 

For the oildrop in the Millikan experiment the numbers give 

$\frac{f_{\text{quad}}}{f_{\text{lin}}} \approx 10^{-7}$    [Millikan oildrop].    (2.11) 

In this case, the quadratic term is totally negligible, and we can take 

$\mathbf{f} = -bv\,\hat{\mathbf{v}} = -b\mathbf{v}$,    (2.12) 

where the second, very compact form follows because, of course, $v\hat{\mathbf{v}} = \mathbf{v}$. 
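
The three estimates in this example are easy to reproduce numerically. The following Python sketch evaluates the ratio (2.7) with the values (2.5) and (2.6) for a sphere in air at STP. The function name `drag_ratio` and the dictionary of cases are illustrative choices, not part of the text.

```python
# Ratio of quadratic to linear drag for a sphere in air at STP, Eq. (2.7).
BETA = 1.6e-4   # N·s/m^2, Eq. (2.5)
GAMMA = 0.25    # N·s^2/m^4, Eq. (2.6)


def drag_ratio(diameter_m, speed_m_per_s):
    """Return f_quad / f_lin = (gamma / beta) * D * v."""
    return (GAMMA / BETA) * diameter_m * speed_m_per_s


# (diameter in m, speed in m/s) for the three objects of Example 2.1.
cases = {
    "baseball": (0.07, 5.0),
    "raindrop": (1e-3, 0.6),
    "Millikan oildrop": (1.5e-6, 5e-5),
}

for name, (d, v) in cases.items():
    print(f"{name:18s} f_quad/f_lin = {drag_ratio(d, v):.2g}")
```

The three ratios differ by many orders of magnitude (a few hundred, about one, and about 10^-7), which is the whole point of the example: the product Dv decides which drag term matters.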

The moral of this example is clear: First, there are objects for which the drag force 
is dominantly linear, and the quadratic force can be neglected — notably, very small 
liquid drops in air, but also slightly larger objects in a very viscous fluid, such as a ball 
bearing moving through molasses. On the other hand, for most projectiles, such as golf 
balls, cannonballs, and even a human in free fall, the dominant drag force is quadratic, 
and we can neglect the linear term. This situation is a little unlucky because the linear 
problem is much easier to solve than the quadratic. In the following two sections, 
I shall discuss the linear case, precisely because it is the easier one. Nevertheless, it 
does have practical applications, and the mathematics used to solve it is widely used in 
many fields. In Section 2.4, I shall take up the harder but more usual case of quadratic 
drag. 

To conclude this introductory section, I should mention the Reynolds number, an 
important parameter that features prominently in more advanced treatments of motion 
in fluids. As already mentioned, the linear drag $f_{\text{lin}}$ can be related to the viscosity of the 
fluid through which our projectile is moving, and the quadratic term $f_{\text{quad}}$ is similarly 
related to the inertia (and hence density) of the fluid. Thus one can relate the ratio 
$f_{\text{quad}}/f_{\text{lin}}$ to the fundamental parameters $\eta$, the viscosity, and $\varrho$, the density, of the 
fluid (see Problem 2.3). The result is that the ratio $f_{\text{quad}}/f_{\text{lin}}$ is of roughly the same 
order of magnitude as the dimensionless number $R = Dv\varrho/\eta$, called the Reynolds 
number. Thus a compact and general way to summarize the foregoing discussion is 
to say that the quadratic drag $f_{\text{quad}}$ is dominant when the Reynolds number R is large, 
whereas the linear drag dominates when R is small. 


2.2 Linear Air Resistance 

Let us consider first a projectile for which the quadratic drag force is negligible, so 
that the force of air resistance is given by (2.12). We shall see directly that, because 
the drag force is linear in v, the equations of motion are very simple to solve. The two 
forces on the projectile are the weight $\mathbf{w} = m\mathbf{g}$ and the drag force $\mathbf{f} = -b\mathbf{v}$, as shown 
in Figure 2.2. Thus the second law, $m\ddot{\mathbf{r}} = \mathbf{F}$, reads 

$m\ddot{\mathbf{r}} = m\mathbf{g} - b\mathbf{v}$.    (2.13) 

Figure 2.2 The two forces on a projectile for which the force of air resistance is linear in the velocity, $\mathbf{f} = -b\mathbf{v}$. 


An interesting feature of this form is that, because neither of the forces depends on r, 
the equation of motion does not involve r itself (only the first and second derivatives 
of r). In fact, we can rewrite $\ddot{\mathbf{r}}$ as $\dot{\mathbf{v}}$, and (2.13) becomes 

$m\dot{\mathbf{v}} = m\mathbf{g} - b\mathbf{v}$,    (2.14) 

a first-order differential equation for v. This simplification comes about because the 
forces depend only on v and not r. It means we have to solve only a first-order 
differential equation for v and then integrate v to find r. 

Perhaps the most important simplifying feature of linear drag is that the equation 
of motion separates into components especially easily. For instance, with x measured 
to the right and y vertically downward, (2.14) resolves into 

$m\dot{v}_x = -bv_x$    (2.15) 

and 

$m\dot{v}_y = mg - bv_y$.    (2.16) 

That is, we have two separate equations, one for $v_x$ and one for $v_y$; the equation for $v_x$ 
does not involve $v_y$ and vice versa. It is important to recognize that this happened only 
because the drag force was linear in v. For instance, if the drag force were quadratic, 

$\mathbf{f} = -cv^2\,\hat{\mathbf{v}} = -cv\,\mathbf{v} = -c\sqrt{v_x^2 + v_y^2}\;\mathbf{v}$,    (2.17) 

then in (2.14) we would have to replace the term $-b\mathbf{v}$ with (2.17). In place of the two 
equations (2.15) and (2.16), we would have 

$m\dot{v}_x = -c\sqrt{v_x^2 + v_y^2}\;v_x$ 
$m\dot{v}_y = mg - c\sqrt{v_x^2 + v_y^2}\;v_y$.    (2.18) 

Here, each equation involves both of the variables v x and v y . These two coupled 
differential equations are much harder to solve than the uncoupled equations of the 
linear case. 
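
The practical meaning of "coupled" can be seen by integrating both sets of equations numerically. The Python sketch below uses a hand-written fourth-order Runge-Kutta step (a standard numerical method, not one used in the text) and compares the final $v_x$ for two launches that differ only in their initial $v_y$. The values m = 1 kg, g = 9.8 m/s², a drag coefficient of 0.1 in SI units (used as b or as c), and the 1 s duration are arbitrary assumptions chosen for illustration.

```python
import math


def final_vx(drag, vx0, vy0, m=1.0, g=9.8, coeff=0.1, t_end=1.0, steps=1000):
    """Integrate the velocity equations with RK4 and return v_x at t_end.

    drag="linear" uses (2.15)-(2.16) with b = coeff;
    drag="quadratic" uses (2.18) with c = coeff.
    y is measured vertically downward, as in the text.
    """
    def accel(vx, vy):
        # Factor multiplying each velocity component in the drag force.
        factor = coeff if drag == "linear" else coeff * math.hypot(vx, vy)
        return (-factor * vx / m, g - factor * vy / m)

    dt = t_end / steps
    vx, vy = vx0, vy0
    for _ in range(steps):
        ax1, ay1 = accel(vx, vy)
        ax2, ay2 = accel(vx + 0.5 * dt * ax1, vy + 0.5 * dt * ay1)
        ax3, ay3 = accel(vx + 0.5 * dt * ax2, vy + 0.5 * dt * ay2)
        ax4, ay4 = accel(vx + dt * ax3, vy + dt * ay3)
        vx += dt * (ax1 + 2 * ax2 + 2 * ax3 + ax4) / 6
        vy += dt * (ay1 + 2 * ay2 + 2 * ay3 + ay4) / 6
    return vx


results = {}
for drag in ("linear", "quadratic"):
    # Same initial v_x, two different initial v_y (illustrative values).
    results[drag] = (final_vx(drag, 10.0, 0.0), final_vx(drag, 10.0, 20.0))
    print(drag, results[drag])
```

With linear drag the two runs end with the same $v_x$, because the x equation never sees $v_y$. With quadratic drag the run with the larger initial $v_y$ ends with a noticeably smaller $v_x$, since the speed $\sqrt{v_x^2 + v_y^2}$ enters both equations.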

Because they are uncoupled, we can solve each equation for linear drag separately 
and then put the two solutions together. Further, each equation defines a problem that 
is interesting in its own right. Equation (2.15) is the equation of motion for an object 
(a cart with frictionless wheels, for instance) coasting horizontally in a medium that 
causes linear drag. Equation (2.16) describes an object (a tiny oil droplet for instance) 
that is falling vertically with linear air resistance. I shall solve these two separate 
problems in turn. 

Figure 2.3 A cart moves on a horizontal frictionless track in a medium that produces a linear drag force. 


Horizontal Motion with Linear Drag 

Consider an object such as the cart in Figure 2.3 coasting horizontally in a linearly 
resistive medium. I shall assume that at t = 0 the cart is at x = 0 with velocity $v_x = v_{x0}$. 
The only force on the cart is the drag $\mathbf{f} = -b\mathbf{v}$; thus the cart inevitably slows down. 
The rate of slowing is determined by (2.15), which has the general form 

$\dot{v}_x = -kv_x$,    (2.19) 

where k is my temporary abbreviation for k = b/m. This is a first-order differential 
equation for $v_x$, whose general solution must contain exactly one arbitrary constant. 
The equation states that the derivative of $v_x$ is equal to −k times $v_x$ itself, and the only 
function with this property is the exponential function 

$v_x(t) = Ae^{-kt}$,    (2.20) 

which satisfies (2.19) for any value of the constant A (Problems 1.24 and 1.25). Since 
this solution contains one arbitrary constant, it is the general solution of our first-order 
equation; that is, any solution must have this form. In our case, we know that 
$v_x(0) = v_{x0}$, so that $A = v_{x0}$, and we conclude that 

$v_x(t) = v_{x0}\,e^{-kt} = v_{x0}\,e^{-t/\tau}$,    (2.21) 

where I have introduced the convenient parameter 

$\tau = 1/k = m/b$    [for linear drag].    (2.22) 

We see that our cart slows down exponentially, as shown in Figure 2.4(a). The 
parameter $\tau$ has the dimensions of time (as you should check), and you can see from 
(2.21) that when $t = \tau$, the velocity is 1/e of its initial value; that is, $\tau$ is the “1/e” 
time for the exponentially decreasing velocity. As $t \to \infty$, the velocity approaches 
zero. 

Figure 2.4 (a) The velocity $v_x$ as a function of time, t, for a cart moving horizontally with a linear resistive force. As $t \to \infty$, $v_x$ approaches zero exponentially. (b) The position x as a function of t for the same cart. As $t \to \infty$, $x \to x_\infty = v_{x0}\tau$. 

To find the position as a function of time, we have only to integrate the velocity 
(2.21). Integrations of this kind can be done using the definite or indefinite integral. 
The definite integral has the advantage that it automatically takes care of the constant 
of integration: Since $v_x = dx/dt$, 

$\int_0^t v_x(t')\,dt' = x(t) - x(0)$. 

(Notice that I have named the “dummy” variable of integration t' to avoid confusion 
with the upper limit t.) Therefore 

$x(t) = x(0) + \int_0^t v_{x0}\,e^{-t'/\tau}\,dt'$ 
$\phantom{x(t)} = 0 + \left[-v_{x0}\,\tau\,e^{-t'/\tau}\right]_0^t$ 
$\phantom{x(t)} = x_\infty\left(1 - e^{-t/\tau}\right)$.    (2.23) 

In the second line, I have used our assumption that x = 0 when t = 0. And in the last, 
I have introduced the parameter 

$x_\infty = v_{x0}\,\tau$,    (2.24) 

which is the limit of x(t) as $t \to \infty$. We conclude that, as the cart slows down, its 
position approaches $x_\infty$ asymptotically, as shown in Figure 2.4(b). 
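
A short numerical check can make the results (2.21) through (2.24) concrete. The Python sketch below evaluates the closed-form expressions and compares x(t) with a direct midpoint-rule integration of the velocity. The values of m, b, and the initial velocity are arbitrary assumptions for illustration, and the function names are hypothetical.

```python
import math

# Illustrative values (assumptions, not from the text).
m = 2.0      # mass in kg
b = 0.5      # linear drag coefficient in N·s/m
vx0 = 3.0    # initial velocity in m/s

tau = m / b          # Eq. (2.22)
x_inf = vx0 * tau    # Eq. (2.24)


def vx(t):
    """Velocity from Eq. (2.21)."""
    return vx0 * math.exp(-t / tau)


def x_exact(t):
    """Position from Eq. (2.23), with x(0) = 0."""
    return x_inf * (1.0 - math.exp(-t / tau))


def x_numeric(t, n=20000):
    """Midpoint-rule estimate of the integral of vx from 0 to t."""
    dt = t / n
    return sum(vx((i + 0.5) * dt) for i in range(n)) * dt


for t in (tau, 3 * tau, 10 * tau):
    print(f"t = {t:5.1f} s  vx/vx0 = {vx(t) / vx0:.4f}  "
          f"x = {x_exact(t):.4f} m  (numeric {x_numeric(t):.4f} m)")
```

The numerical integral should agree with (2.23) to many decimal places, and by t = 10τ the position is already within a small fraction of a percent of the limiting value $v_{x0}\tau$.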


Vertical Motion with Linear Drag 

Let us next consider a projectile that is subject to linear air resistance and is thrown 
vertically downward. The two forces on the projectile are gravity and air resistance, as 
shown in Figure 2.5. If we measure y vertically down, the only interesting component 
of the equation of motion is the y component, which reads 

$m\dot{v}_y = mg - bv_y$.    (2.25) 

With the velocity downward (v y > 0), the retarding force is upward, while the force 
of gravity is downward. If v y is small, the force of gravity is more important than 
the drag force, and the falling object accelerates in its downward motion. This will 
continue until the drag force balances the weight. The speed at which this balance 
occurs is easily found by setting $\dot{v}_y = 0$ in (2.25), to give $v_y = mg/b$ or 

$v_y = v_{\text{ter}}$ 

where I have defined the terminal speed 

$v_{\text{ter}} = \frac{mg}{b}$    [for linear drag].    (2.26) 

Figure 2.5 The forces on a projectile that is thrown vertically down, subject to linear air resistance. 

The terminal speed is the speed at which our projectile will eventually fall, if given 
the time to do so. Since it depends on m and b, it is different for different bodies. For 
example, if two objects have the same shape and size (b the same for both), the heavier 
object (m larger) will have the higher terminal speed, just as you would expect. Since 
$v_{\text{ter}}$ is inversely proportional to the coefficient b of air resistance, we can view $v_{\text{ter}}$ as 
an inverse measure of the importance of air resistance: the larger the air resistance, 
the smaller $v_{\text{ter}}$, again just as you would expect. 


example 2.2 Terminal Speed of Small Liquid Drops 

Find the terminal speed of a tiny oildrop in the Millikan oildrop experiment 
(diameter $D = 1.5\ \mu\text{m}$ and density $\varrho = 840\ \text{kg/m}^3$). Do the same for a small 
drop of mist with diameter D = 0.2 mm. 

From Example 2.1 we know that the linear drag is dominant for these objects, 
so the terminal speed is given by (2.26). According to (2.4), $b = \beta D$ where 
$\beta = 1.6 \times 10^{-4}$ (in SI units). The mass of the drop is $m = \varrho\pi D^3/6$. Thus (2.26) 
becomes 

$v_{\text{ter}} = \frac{\varrho\pi D^2 g}{6\beta}$    [for linear drag].    (2.27) 

This interesting result shows that, for a given density, the terminal speed is 
proportional to $D^2$. This implies that, once air resistance has become important, 
a large sphere will fall faster than a small sphere of the same density.² 

Putting in the numbers, we find for the oildrop 

$v_{\text{ter}} = \frac{(840) \times \pi \times (1.5 \times 10^{-6})^2 \times (9.8)}{6 \times (1.6 \times 10^{-4})} = 6.1 \times 10^{-5}\ \text{m/s}$    [oildrop]. 

In the Millikan oildrop experiment, the oildrops fall exceedingly slowly, so their 
speed can be measured by simply watching them through a microscope. 

Putting in the numbers for the drop of mist, we find similarly that 

$v_{\text{ter}} = 1.3\ \text{m/s}$    [drop of mist].    (2.28) 

This speed is representative for a fine drizzle. For a larger raindrop, the terminal 
speed would be appreciably larger, but with a larger (and hence also faster) drop, 
the quadratic drag would need to be included in the calculation to get a reliable 
value for $v_{\text{ter}}$. 
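
The two terminal speeds of this example can be checked with a few lines of Python. The sketch below evaluates (2.27). The density of the mist drop is not stated in the example, so the code assumes it is water at 1000 kg/m³; the function name is an illustrative choice.

```python
import math

BETA = 1.6e-4   # N·s/m^2, coefficient of linear drag per unit diameter, Eq. (2.5)
G = 9.8         # m/s^2


def terminal_speed_linear(density, diameter):
    """Terminal speed of a sphere with linear drag, Eq. (2.27).

    density in kg/m^3, diameter in m; returns a speed in m/s.
    """
    return density * math.pi * diameter**2 * G / (6.0 * BETA)


v_oil = terminal_speed_linear(840.0, 1.5e-6)
# Assumption: the mist drop is water, with density 1000 kg/m^3.
v_mist = terminal_speed_linear(1000.0, 0.2e-3)

print(f"oildrop:      {v_oil:.2g} m/s")
print(f"drop of mist: {v_mist:.2g} m/s")
```

Because (2.27) scales as $D^2$, changing the diameter by a factor of about 130 between the two drops changes the terminal speed by a factor of roughly 2 × 10⁴.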


So far, we have discussed the terminal speed of a projectile (moving vertically), 
but we must now discuss how the projectile approaches that speed. This is determined 
by the equation of motion (2.25) which we can rewrite as 

$m\dot{v}_y = -b(v_y - v_{\text{ter}})$.    (2.29) 

(Remember that $v_{\text{ter}} = mg/b$.) This differential equation can be solved in several ways. 
(For one alternative see Problem 2.9.) Perhaps the simplest is to note that it is almost 
the same as Equation (2.15) for the horizontal motion, except that on the right we now 
have $(v_y - v_{\text{ter}})$ instead of $v_x$. The solution for the horizontal case was the exponential 
function (2.20). The trick to solving our new vertical equation (2.29) is to introduce 
the new variable $u = (v_y - v_{\text{ter}})$, which satisfies $m\dot{u} = -bu$ (because $v_{\text{ter}}$ is constant). 
Since this is exactly the same as Equation (2.15) for the horizontal motion, the solution 
for u is the same exponential, $u = Ae^{-t/\tau}$. [Remember that the constant k in (2.20) 
became $k = 1/\tau$.] Therefore, 

$v_y - v_{\text{ter}} = Ae^{-t/\tau}$. 

When t = 0, $v_y = v_{y0}$, so $A = v_{y0} - v_{\text{ter}}$ and our final solution for $v_y$ as a function of 
t is 

$v_y(t) = v_{\text{ter}} + (v_{y0} - v_{\text{ter}})\,e^{-t/\tau}$    (2.30) 
$\phantom{v_y(t)} = v_{y0}\,e^{-t/\tau} + v_{\text{ter}}\left(1 - e^{-t/\tau}\right)$.    (2.31) 


$^2$ We are here assuming that the drag force is linear, but the same qualitative conclusion follows
for a quadratic drag force. (Problem 2.24.)

Figure 2.6 When an object is dropped in a medium with
linear resistance, $v_y$ approaches its terminal value $v_{\rm ter}$ as
shown.


This second expression gives $v_y(t)$ as the sum of two terms: The first is equal to $v_{yo}$
when $t = 0$, but fades away to zero as $t$ increases; the second is equal to zero when
$t = 0$, but approaches $v_{\rm ter}$ as $t \to \infty$. In particular, as $t \to \infty$,

$$v_y(t) \to v_{\rm ter} \tag{2.32}$$

just as we anticipated.

Let us examine the result (2.31) in a little more detail for the case that $v_{yo} = 0$;
that is, the projectile is dropped from rest. In this case (2.31) reads

$$v_y(t) = v_{\rm ter}\left(1 - e^{-t/\tau}\right). \tag{2.33}$$

This result is plotted in Figure 2.6, where we see that $v_y$ starts out from 0 and
approaches the terminal speed, $v_y \to v_{\rm ter}$, asymptotically as $t \to \infty$. The significance
of the time $\tau$ for a falling body is easily read off from (2.33). When $t = \tau$, we see that

$$v_y = v_{\rm ter}\left(1 - e^{-1}\right) = 0.63\,v_{\rm ter}.$$


That is, in a time $\tau$, the object reaches 63% of the terminal speed. Similar calculations
give the following results:

    time $t$      percent of $v_{\rm ter}$
    0             0
    $\tau$        63%
    $2\tau$       86%
    $3\tau$       95%
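
The entries in this table follow directly from (2.33) with $t = n\tau$. A minimal Python sketch (the function name is my own) reproduces them:

```python
import math

def fraction_of_terminal(n):
    """Fraction v_y / v_ter after a time t = n * tau, from (2.33)."""
    return 1.0 - math.exp(-n)

# Percent of terminal speed at t = 0, tau, 2 tau, 3 tau, rounded to whole percent
percents = [round(100 * fraction_of_terminal(n)) for n in (0, 1, 2, 3)]
print(percents)
```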

Of course, the object’s speed never actually reaches $v_{\rm ter}$, but $\tau$ is a good measure of
how fast the speed approaches $v_{\rm ter}$. In particular, when $t = 3\tau$ the speed is 95% of $v_{\rm ter}$,
and for many purposes we can say that after a time $3\tau$ the speed is essentially equal
to $v_{\rm ter}$.

EXAMPLE 2.3 Characteristic Time for Two Liquid Drops

Find the characteristic times, $\tau$, for the oildrop and drop of mist in Example 2.2.

The characteristic time $\tau$ was defined in (2.22) as $\tau = m/b$, and $v_{\rm ter}$ was
defined in (2.26) as $v_{\rm ter} = mg/b$. Thus we have the useful relation

$$v_{\rm ter} = g\tau. \tag{2.34}$$

Notice that this relation lets us interpret $v_{\rm ter}$ as the speed a falling object would
acquire in a time $\tau$, if it had a constant acceleration equal to $g$. Also note that,
like $v_{\rm ter}$, the time $\tau$ is an inverse indicator of the importance of air resistance:
When the coefficient $b$ of air resistance is small, both $v_{\rm ter}$ and $\tau$ are large; when
$b$ is large, both $v_{\rm ter}$ and $\tau$ are small.

For our present purposes, the importance of (2.34) is that, since we have
already found the terminal velocities of the two drops, we can immediately find
the values of $\tau$. For the Millikan oildrop, we found that $v_{\rm ter} = 6.1 \times 10^{-5}$ m/s,
therefore

$$\tau = \frac{v_{\rm ter}}{g} = \frac{6.1 \times 10^{-5}}{9.8} = 6.2 \times 10^{-6}\ \text{s} \qquad \text{[oildrop]}.$$

After falling for just 20 microseconds, this oildrop will have acquired 95% of
its terminal speed. For almost every purpose, the oildrop always travels at its
terminal speed.

For the drop of mist of Example 2.2, the terminal speed was $v_{\rm ter} = 1.3$ m/s
and so $\tau = v_{\rm ter}/g \approx 0.13$ s. After about 0.4 s, the drop will have acquired 95%
of its terminal speed.
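
The two characteristic times, and the corresponding times $3\tau$ to reach 95% of the terminal speed, can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name is my own.

```python
G = 9.8  # gravitational acceleration, m/s^2

def characteristic_time(v_ter, g=G):
    """Characteristic time tau = v_ter / g, from (2.34)."""
    return v_ter / g

# Terminal speeds quoted in Example 2.2
for name, v_ter in [("oildrop", 6.1e-5), ("mist", 1.3)]:
    tau = characteristic_time(v_ter)
    print(f"{name}: tau = {tau:.2e} s, 95% of v_ter after about {3 * tau:.2e} s")
```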


Whether or not our falling object starts from rest, we can find its position $y$ as a
function of time by integrating the known form (2.30) of $v_y$,

$$v_y(t) = v_{\rm ter} + (v_{yo} - v_{\rm ter})e^{-t/\tau}.$$

Assuming that the projectile’s initial position is $y = 0$, it immediately follows that

$$y(t) = \int_0^t v_y(t')\,dt' = v_{\rm ter}t + (v_{yo} - v_{\rm ter})\tau\left(1 - e^{-t/\tau}\right). \tag{2.35}$$
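
One way to convince yourself of this integral is to compare the closed form with a direct numerical integration of $v_y$. The following Python sketch does this with the trapezoidal rule; the function names and the step count are my own choices, and the numbers are those of the mist drop released from rest.

```python
import math

def v_y(t, v_y0, v_ter, tau):
    """Velocity (2.30), with y measured downward."""
    return v_ter + (v_y0 - v_ter) * math.exp(-t / tau)

def y_closed(t, v_y0, v_ter, tau):
    """Closed-form position (2.35), assuming y(0) = 0."""
    return v_ter * t + (v_y0 - v_ter) * tau * (1.0 - math.exp(-t / tau))

def y_numeric(t, v_y0, v_ter, tau, steps=10_000):
    """Trapezoidal-rule integral of v_y from 0 to t."""
    h = t / steps
    total = 0.5 * (v_y(0.0, v_y0, v_ter, tau) + v_y(t, v_y0, v_ter, tau))
    total += sum(v_y(i * h, v_y0, v_ter, tau) for i in range(1, steps))
    return total * h

# Mist drop dropped from rest: v_y0 = 0, v_ter = 1.3 m/s, tau = 0.13 s
args = (0.0, 1.3, 0.13)
print(y_closed(0.5, *args), y_numeric(0.5, *args))
```

The two printed values should agree to many decimal places.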

This equation for $y(t)$ can now be combined with Equation (2.23) for $x(t)$ to give
us the orbit of any projectile, moving both horizontally and vertically, in a linear
medium.

2.3 Trajectory and Range in a Linear Medium


We saw at the beginning of the last section that the equation of motion for a projectile
moving in any direction resolves into two separate equations, one for the horizontal
and one for the vertical motion [Equations (2.15) and (2.16)]. We have solved each
of these separate equations in (2.23) and (2.35), and we can now put these solutions
together to give the trajectory of an arbitrary projectile moving in any direction. In
this discussion it is marginally more convenient to measure $y$ vertically upward, in
which case we must reverse the sign of $v_{\rm ter}$. (Make sure you understand this point.)
Thus the two equations of the orbit become

$$\begin{aligned} x(t) &= v_{xo}\tau\left(1 - e^{-t/\tau}\right) \\ y(t) &= (v_{yo} + v_{\rm ter})\tau\left(1 - e^{-t/\tau}\right) - v_{\rm ter}t. \end{aligned} \tag{2.36}$$



You can eliminate $t$ from these two equations by solving the first for $t$ and then
substituting into the second. (See Problem 2.17.) The result is the equation for the
trajectory:

$$y = \frac{v_{yo} + v_{\rm ter}}{v_{xo}}\,x + v_{\rm ter}\tau\ln\left(1 - \frac{x}{v_{xo}\tau}\right). \tag{2.37}$$

This equation is probably too complicated to be especially illuminating, but I have
plotted it as the solid curve in Figure 2.7, with the help of which you can understand
some of the features of (2.37). For example, if you look at the second term on the right
of (2.37), you will see that as $x \to v_{xo}\tau$ the argument of the log function approaches
zero; therefore, the log term and hence $y$ both approach $-\infty$. That is, the trajectory
has a vertical asymptote at $x = v_{xo}\tau$, as you can see in the picture. I leave it as an
exercise (Problem 2.19) for you to check that if air resistance is switched off ($v_{\rm ter}$ and
$\tau$ both approach infinity), the trajectory defined by (2.37) does indeed approach the
dashed trajectory corresponding to zero air resistance.
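
Both claims can be spot-checked numerically. The Python sketch below confirms that a point of the parametric orbit (2.36) lies on the trajectory (2.37), and that for a very large $\tau$ the trajectory is close to the vacuum parabola $y = (v_{yo}/v_{xo})x - \tfrac{1}{2}g(x/v_{xo})^2$. The function names and the launch values are hypothetical, chosen only for illustration.

```python
import math

def position(t, v_x0, v_y0, v_ter, tau):
    """Parametric orbit (2.36), with y measured upward."""
    decay = 1.0 - math.exp(-t / tau)
    return v_x0 * tau * decay, (v_y0 + v_ter) * tau * decay - v_ter * t

def trajectory_y(x, v_x0, v_y0, v_ter, tau):
    """Trajectory (2.37); only defined for x < v_x0 * tau."""
    return (v_y0 + v_ter) / v_x0 * x + v_ter * tau * math.log1p(-x / (v_x0 * tau))

def vacuum_y(x, v_x0, v_y0, g=9.8):
    """Zero-resistance parabola, for comparison."""
    return v_y0 / v_x0 * x - 0.5 * g * (x / v_x0) ** 2

# Hypothetical launch: 10 m/s in each direction, tau = 2 s, v_ter = g * tau
g, tau = 9.8, 2.0
x, y = position(1.0, 10.0, 10.0, g * tau, tau)
print(y, trajectory_y(x, 10.0, 10.0, g * tau, tau))

# Weak resistance: a very large tau should give nearly the vacuum parabola
big_tau = 1.0e4
y_weak = trajectory_y(5.0, 10.0, 10.0, g * big_tau, big_tau)
print(y_weak, vacuum_y(5.0, 10.0, 10.0))
```

I use `math.log1p` rather than `math.log(1 - ...)` so that the logarithm stays accurate when its argument is very close to 1.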


Horizontal Range 

A standard (and quite interesting) problem in elementary physics courses is to show
that the horizontal range $R$ of a projectile (subject to no air resistance of course) is

$$R_{\rm vac} = \frac{2v_{xo}v_{yo}}{g} \qquad \text{[no air resistance]} \tag{2.38}$$

where $R_{\rm vac}$ stands for the range in a vacuum. Let us see how this result is modified by
air resistance.

The range $R$ is the value of $x$ when $y$ as given by (2.37) is zero. Thus $R$ is the
solution of the equation

$$\frac{v_{yo} + v_{\rm ter}}{v_{xo}}\,R + v_{\rm ter}\tau\ln\left(1 - \frac{R}{v_{xo}\tau}\right) = 0. \tag{2.39}$$

Figure 2.7 The trajectory of a projectile subject to a linear drag
force (solid curve) and the corresponding trajectory in a vacuum
(dashed curve). At first the two curves are very similar, but as $t$
increases, air resistance slows the projectile and pulls its trajectory
down, with a vertical asymptote at $x = v_{xo}\tau$. The horizontal
range of the projectile is labeled $R$, and the corresponding range
in vacuum $R_{\rm vac}$.


This is a transcendental equation and cannot be solved analytically, that is, in terms of
well-known, elementary functions such as logs, or sines and cosines. For a given choice
of parameters, it can be solved numerically with a computer (Problem 2.22), but this
approach usually gives one little sense of how the solution depends on the parameters.
Often a good alternative is to find some approximation that allows an approximate
analytic solution. (Before the advent of computers, this was often the only way to find
out what happens.) In the present case, it is often clear that the effects of air resistance
should be small. This means that both $v_{\rm ter}$ and $\tau$ are large and the second term in the
argument of the log function is small (since it has $\tau$ in its denominator). This suggests
that we expand the log in a Taylor series (see Problem 2.18):

$$\ln(1 - \epsilon) = -\left(\epsilon + \tfrac{1}{2}\epsilon^2 + \tfrac{1}{3}\epsilon^3 + \cdots\right). \tag{2.40}$$

We can use this expansion for the log term in (2.39), and, provided $\tau$ is large enough,
we can surely neglect the terms beyond $\epsilon^3$. This gives the equation

$$\left[\frac{v_{yo} + v_{\rm ter}}{v_{xo}}\right]R - v_{\rm ter}\tau\left[\frac{R}{v_{xo}\tau} + \frac{1}{2}\left(\frac{R}{v_{xo}\tau}\right)^2 + \frac{1}{3}\left(\frac{R}{v_{xo}\tau}\right)^3\right] = 0. \tag{2.41}$$



This equation can be quickly tidied up. First, the second term in the first bracket
cancels the first term in the second. Next, every term contains a factor of $R$. This
implies that one solution is $R = 0$, which is correct, since the height $y$ is zero when $x = 0$.
Nevertheless, this is not the solution we are interested in, and we can divide out the
common factor of $R$. A little rearrangement (and replacement of $v_{\rm ter}/\tau$ by $g$) lets us
rewrite the equation as

$$R = \frac{2v_{xo}v_{yo}}{g} - \frac{2}{3v_{xo}\tau}R^2. \tag{2.42}$$



This may seem a perverse way to write a quadratic equation for $R$, but it leads us
quickly to the desired approximate solution. The point is that the second term on the
right is very small. (In the numerator $R$ is certainly no more than $R_{\rm vac}$ and we are
assuming that $\tau$ in the denominator is very large.) Therefore, as a first approximation
we get

$$R \approx \frac{2v_{xo}v_{yo}}{g} = R_{\rm vac}. \tag{2.43}$$

This is just what we expected: For low air resistance, the range is close to $R_{\rm vac}$. But
with the help of (2.42) we can now get a second, better approximation. The last term of
(2.42) is the required correction to $R_{\rm vac}$; because it is already small, we would certainly
be satisfied with an approximate value for this correction. Thus, in evaluating the last
term of (2.42), we can replace $R$ with the approximate value $R \approx R_{\rm vac}$, and we find
as our second approximation [remember that the first term in (2.42) is just $R_{\rm vac}$]

$$\begin{aligned} R &\approx R_{\rm vac} - \frac{2}{3v_{xo}\tau}(R_{\rm vac})^2 \\ &= R_{\rm vac}\left(1 - \frac{4}{3}\frac{v_{yo}}{v_{\rm ter}}\right). \end{aligned} \tag{2.44}$$

(To get the second line, I replaced the second $R_{\rm vac}$ in the previous line by $2v_{xo}v_{yo}/g$
and $\tau g$ by $v_{\rm ter}$.) Notice that the correction for air resistance always makes $R$ smaller
than $R_{\rm vac}$, as one would expect. Notice also that the correction depends only on the
ratio $v_{yo}/v_{\rm ter}$. More generally, it is easy to see (Problem 2.32) that the importance of
air resistance is indicated by the ratio $v/v_{\rm ter}$ of the projectile’s speed to the terminal
speed. If $v/v_{\rm ter} \ll 1$ throughout the flight, the effect of air resistance is very small;
if $v/v_{\rm ter}$ is around 1 or more, air resistance is almost certainly important [and the
approximation (2.44) is certainly no good].


EXAMPLE 2.4 Range of Small Metal Pellets

I flick a tiny metal pellet with diameter $d = 0.2$ mm and speed $v = 1$ m/s at 45°. Find
its horizontal range assuming the pellet is gold (density $\varrho \approx 16$ g/cm$^3$). What if
it is aluminum (density $\varrho \approx 2.7$ g/cm$^3$)?

In the absence of air resistance, both pellets would have the same range,

$$R_{\rm vac} = \frac{2v_{xo}v_{yo}}{g} = 10.2\ \text{cm}.$$

For gold, Equation (2.27) gives (as you can check) $v_{\rm ter} \approx 21$ m/s. Thus the
correction term in (2.44) is

$$\frac{4}{3}\frac{v_{yo}}{v_{\rm ter}} \approx \frac{4}{3} \times \frac{0.71}{21} \approx 0.05.$$

That is, air resistance reduces the range by 5% to about 9.7 cm. The density
of aluminum is about 1/6 that of gold. Therefore the terminal speed is
one sixth as big, and the correction for aluminum is 6 times greater or about
30%, giving a range of about 7 cm. For the gold pellet the correction for air
resistance is quite small and could perhaps be neglected; for the aluminum pellet,
the correction is still small, but is certainly not negligible.
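
To see how good the approximation (2.44) is, one can solve the transcendental equation (2.39) numerically and compare. The following Python sketch does this by bisection for the two pellets. The function names and the bracketing interval are my own choices; the bracket relies on the left side of (2.39) being positive for small $R$ and negative at $R = R_{\rm vac}$.

```python
import math

G, BETA = 9.8, 1.6e-4   # SI units; BETA is the linear-drag constant from (2.4)

def range_equation(R, v_x0, v_y0, v_ter, tau):
    """Left side of (2.39); its nonzero root is the range R."""
    return (v_y0 + v_ter) / v_x0 * R + v_ter * tau * math.log1p(-R / (v_x0 * tau))

def solve_range(v_x0, v_y0, v_ter, tau, tol=1e-12):
    """Bisection between a small positive R and the vacuum range."""
    r_vac = 2 * v_x0 * v_y0 / G
    lo, hi = 1e-3 * r_vac, r_vac
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if range_equation(mid, v_x0, v_y0, v_ter, tau) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pellet(density, diameter=0.2e-3, speed=1.0, angle_deg=45.0):
    """Return (R_vac, approximate R from (2.44), numerical R from (2.39))."""
    v_x0 = speed * math.cos(math.radians(angle_deg))
    v_y0 = speed * math.sin(math.radians(angle_deg))
    v_ter = density * math.pi * diameter**2 * G / (6 * BETA)   # (2.27)
    tau = v_ter / G                                            # (2.34)
    r_vac = 2 * v_x0 * v_y0 / G
    r_approx = r_vac * (1 - 4 * v_y0 / (3 * v_ter))            # (2.44)
    return r_vac, r_approx, solve_range(v_x0, v_y0, v_ter, tau)

for name, rho in [("gold", 16e3), ("aluminum", 2.7e3)]:
    r_vac, r_approx, r_num = pellet(rho)
    print(f"{name}: R_vac = {100 * r_vac:.1f} cm, "
          f"(2.44) gives {100 * r_approx:.1f} cm, (2.39) gives {100 * r_num:.1f} cm")
```

For the gold pellet the two estimates should agree closely; for aluminum, where the correction is larger, the terms neglected in (2.44) begin to matter.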


2.4 Quadratic Air Resistance 


In the last two sections we have developed a rather complete theory of projectiles
subject to a linear drag force, $\mathbf{f} = -b\mathbf{v}$. While we can find examples of projectiles for
which the drag is linear (notably very small objects, such as the Millikan oildrop), for
most of the more obvious examples of projectiles (baseballs, footballs, cannonballs,
and the like) it is a far better approximation to say that the drag is pure quadratic,
$\mathbf{f} = -cv^2\hat{\mathbf{v}}$. We must, therefore, develop a corresponding theory for a quadratic drag
force. On the face of it, the two theories are not so very different. In either case we
have to solve the differential equation

$$m\dot{\mathbf{v}} = m\mathbf{g} + \mathbf{f}, \tag{2.45}$$

and in both cases this is a first-order differential equation for the velocity $\mathbf{v}$, with $\mathbf{f}$
depending in a relatively simple way on $\mathbf{v}$. There is, however, an important difference.
In the linear case ($\mathbf{f} = -b\mathbf{v}$), Equation (2.45) is a linear differential equation, inasmuch
as the terms that involve $\mathbf{v}$ are all linear in $\mathbf{v}$ or its derivatives. In the quadratic
case, Equation (2.45) is, of course, nonlinear. And it turns out that the mathematical
theory of nonlinear differential equations is significantly more complicated than
the linear theory. As a practical matter, we shall find that for the case of a general
projectile, moving in both the $x$ and $y$ directions, Equation (2.45) cannot be solved
in terms of elementary functions when the drag is quadratic. More generally, we
shall see in Chapter 12 that for more complicated systems, nonlinearity can lead to
the astonishing phenomenon of chaos, although this does not happen in the present
case.

In this section, I shall start with the same two special cases discussed in Section 2.2,
a body that is constrained to move horizontally, such as a railroad car on a horizontal
track, and a body that moves vertically, such as a stone dropped from a window (both
now with quadratic drag forces). We shall find that in these two especially simple cases
the differential equation (2.45) can be solved by elementary means, and the solutions
introduce some important techniques and interesting results. I shall then discuss briefly
the general case (motion in both the horizontal and vertical directions), which can be
solved only numerically.


Horizontal Motion with Quadratic Drag 

Let us consider a body moving horizontally (in the positive $x$ direction), subject to
a quadratic drag and no other forces. For example, you could imagine a cycle racer,
who has crossed the finishing line and is coasting to a stop under the influence of air
resistance. To the extent that the cycle is well lubricated and tires well inflated, we
can ignore ordinary friction,$^3$ and, except at very low speeds, air resistance is purely
quadratic. The $x$ component of the equation of motion is therefore (I’ll abbreviate $v_x$
to $v$)

$$m\frac{dv}{dt} = -cv^2. \tag{2.46}$$

If we divide by $v^2$ and multiply by $dt$, we get an equation in which only the variable
$v$ appears on the left and only $t$ on the right:$^4$

$$m\frac{dv}{v^2} = -c\,dt. \tag{2.47}$$

This trick, rearranging a differential equation so that only one variable appears
on the left and only the other on the right, is called separation of variables. When
it is possible, separation of variables is often the simplest way to solve a first-order
differential equation, since the solution can be found by simple integration of both
sides.

$^3$ As I shall discuss shortly, when the cyclist slows down to a stop, air resistance becomes smaller,
and eventually friction becomes the dominant force. Nevertheless, at speeds around 10 mph or more,
it is a fair approximation to ignore everything but the quadratic air resistance.

$^4$ In passing from (2.46) to (2.47), I have treated the derivative $dv/dt$ as if it were the quotient of
two separate numbers, $dv$ and $dt$. As you are certainly aware, this cavalier proceeding is not strictly
correct. Nevertheless, it can be justified in two ways. First, in the theory of differentials, it is in fact
true that $dv$ and $dt$ are defined as separate numbers (differentials), such that their quotient is the
derivative $dv/dt$. Fortunately, it is quite unnecessary to know about this theory. As physicists we
know that $dv/dt$ is the limit of $\Delta v/\Delta t$, as both $\Delta v$ and $\Delta t$ become small, and I shall take the view
that $dv$ is just shorthand for $\Delta v$ (and likewise $dt$ for $\Delta t$), with the understanding that it has been
taken small enough that the quotient $dv/dt$ is within my desired accuracy of the true derivative. With
this understanding, (2.47), with $dv$ on one side and $dt$ on the other, makes perfectly good sense.

Integrating Equation (2.47) we find

$$m\int_{v_0}^{v}\frac{dv'}{v'^2} = -c\int_0^t dt'$$

where $v_0$ is the initial velocity at $t = 0$. Notice that I have written both sides as definite
integrals, with the appropriate limits, so that I shan’t have to worry about any constants
of integration. I have also renamed the variables of integration as $v'$ and $t'$ to avoid
confusion with the upper limits $v$ and $t$. Both of these integrals are easily evaluated,
and we find

$$m\left(\frac{1}{v_0} - \frac{1}{v}\right) = -ct \tag{2.48}$$



or, solving for $v$,

$$v(t) = \frac{v_0}{1 + cv_0t/m} = \frac{v_0}{1 + t/\tau} \tag{2.49}$$

where I have introduced the abbreviation $\tau$ for the combination of constants

$$\tau = \frac{m}{cv_0} \qquad \text{[for quadratic drag]}. \tag{2.50}$$

As you can easily check, $\tau$ is a time, with the significance that when $t = \tau$ the velocity
is $v = v_0/2$. Notice that this parameter $\tau$ is different from the $\tau$ introduced in (2.22) for
motion subject to linear air resistance; nevertheless, both parameters have the same
general significance as indicators of the time for air resistance to slow the motion
appreciably.

To find the bicycle’s position $x$, we have only to integrate $v$ to give (as you should
check)

$$x(t) = x_0 + \int_0^t v(t')\,dt' = v_0\tau\ln(1 + t/\tau), \tag{2.51}$$

if we take the initial position $x_0$ to be zero. Figure 2.8 shows our results for $v$ and
$x$ as functions of $t$. It is interesting to compare these graphs with the corresponding
graphs of Figure 2.4 for a body coasting horizontally but subject to a linear resistance.
Superficially, the two graphs for the velocity look similar. In particular, both go to
zero as $t \to \infty$. But in the linear case $v$ goes to zero exponentially, whereas in the
quadratic case it does so only very slowly, like $1/t$. This difference in the behavior
of $v$ manifests itself quite dramatically in the behavior of $x$. In the linear case, we
saw that $x$ approaches a finite limit as $t \to \infty$, but it is clear from (2.51) that in the
quadratic case $x$ increases without limit as $t \to \infty$.

Figure 2.8 The motion of a body, such as a bicycle, coasting
horizontally and subject to a quadratic air resistance. (a) The
velocity is given by (2.49) and goes to zero like $1/t$ as $t \to \infty$.
(b) The position is given by (2.51) and goes to infinity as $t \to \infty$.

The striking difference in the behavior of $x$ for quadratic and linear drags is easy
to understand qualitatively. In the quadratic case, the drag is proportional to $v^2$. Thus
as $v$ gets small, the drag gets very small, so small that it fails to bring the bicycle
to rest at any finite value of $x$. This unexpected behavior serves to highlight that a
drag force that is proportional to $v^2$ at all speeds is unrealistic. Although the linear
drag and ordinary friction are very small, nevertheless as $v \to 0$ they must eventually
become more important than the $v^2$ term and cannot be ignored. In particular, one or
another of these two terms (friction in the case of a bicycle) ensures that no real body
can coast on to infinity!
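
The closed-form results for pure quadratic drag, $v = v_0/(1 + t/\tau)$ and $x = v_0\tau\ln(1 + t/\tau)$ with $\tau = m/(cv_0)$, can be checked against a direct step-by-step integration of $m\,dv/dt = -cv^2$. The Python sketch below uses the midpoint method; the cyclist’s numbers are hypothetical, chosen only so that $\tau = 20$ s.

```python
import math

def v_exact(t, v0, tau):
    """Velocity for pure quadratic drag, v0 / (1 + t/tau)."""
    return v0 / (1.0 + t / tau)

def x_exact(t, v0, tau):
    """Position for pure quadratic drag with x(0) = 0."""
    return v0 * tau * math.log(1.0 + t / tau)

def coast(v0, c_over_m, t_end, dt=1e-3):
    """Integrate dv/dt = -(c/m) v^2 and dx/dt = v with the midpoint method."""
    v, x = v0, 0.0
    for _ in range(round(t_end / dt)):
        v_mid = v - 0.5 * dt * c_over_m * v * v   # velocity at the half step
        x += dt * v_mid
        v -= dt * c_over_m * v_mid * v_mid
    return v, x

# Hypothetical cyclist: v0 = 10 m/s and c/m = 0.005 per meter, so tau = 20 s
v0, c_over_m = 10.0, 0.005
tau = 1.0 / (c_over_m * v0)
v_num, x_num = coast(v0, c_over_m, t_end=tau)
print(v_num, v_exact(tau, v0, tau))   # both should be close to v0 / 2
print(x_num, x_exact(tau, v0, tau))

# The logarithm grows without limit, although very slowly
print([round(x_exact(t, v0, tau)) for t in (1e2, 1e4, 1e6)])
```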


Vertical Motion with Quadratic Drag 

The case that an object moves vertically with a quadratic drag force can be solved in
much the same way as the horizontal case. Consider a baseball that is dropped from a
window in a high tower. If we measure the coordinate $y$ vertically down, the equation
of motion is (I’ll abbreviate $v_y$ to $v$ now)

$$m\dot{v} = mg - cv^2. \tag{2.52}$$

Before we solve this equation, let us consider the ball’s terminal speed, the speed at
which the two terms on the right of (2.52) just balance. Evidently this must satisfy
$cv^2 = mg$, whose solution is

$$v_{\rm ter} = \sqrt{\frac{mg}{c}}. \tag{2.53}$$

For any given object (given $m$, $g$, and $c$), this lets us calculate the terminal speed. For
example, for a baseball it gives (as we shall see in a moment) $v_{\rm ter} \approx 35$ m/s, or nearly
80 miles per hour.

We can tidy the equation of motion (2.52) a little by using (2.53) to replace $c$ by
$mg/v_{\rm ter}^2$ and canceling the factors of $m$:

$$\dot{v} = g\left(1 - \frac{v^2}{v_{\rm ter}^2}\right). \tag{2.54}$$

This can be solved by separation of variables, just as in the case of horizontal
motion: First we can rewrite it as

$$\frac{dv}{1 - v^2/v_{\rm ter}^2} = g\,dt. \tag{2.55}$$



This is the desired separated form (only $v$ on the left and only $t$ on the right) and
we can simply integrate both sides.$^5$ Assuming the ball starts from rest, the limits of
integration are 0 and $v$ on the left and 0 and $t$ on the right, and we find (as you should
verify in Problem 2.35)

$$\frac{v_{\rm ter}}{g}\,\text{arctanh}\left(\frac{v}{v_{\rm ter}}\right) = t \tag{2.56}$$

where “arctanh” denotes the inverse hyperbolic tangent. This particular integral can be
evaluated alternatively in terms of the natural log function (Problem 2.37). However,
the hyperbolic functions, sinh, cosh, and tanh, and their inverses arcsinh, arccosh, and
arctanh, come up so often in all branches of physics that you really should learn to use
them. If you have not had much exposure to them, you might want to look at Problems
2.33 and 2.34, and study graphs of these functions.

$^5$ Notice that in fact any one-dimensional problem where the net force depends only on the
velocity can be solved by separation of variables, since the equation $m\dot{v} = F(v)$ can always be
written as $m\,dv/F(v) = dt$. Of course there is no assurance that this can be integrated analytically
if $F(v)$ is too complicated, but it does guarantee a straightforward numerical solution at worst. See
Problem 2.7.

Equation (2.56) can be solved for $v$ to give

$$v = v_{\rm ter}\tanh\left(\frac{gt}{v_{\rm ter}}\right). \tag{2.57}$$

To find the position $y$, we just integrate $v$ to give

$$y = \frac{(v_{\rm ter})^2}{g}\ln\left[\cosh\left(\frac{gt}{v_{\rm ter}}\right)\right]. \tag{2.58}$$

While both of these two formulas can be cleaned up a little (see Problem 2.35), they
are already sufficient to work the following example.


EXAMPLE 2.5 A Baseball Dropped from a High Tower

Find the terminal speed of a baseball (mass $m = 0.15$ kg and diameter $D = 7$
cm). Make plots of its velocity and position for the first six seconds after it is
dropped from a tall tower.

The terminal speed is given by (2.53), with the coefficient of air resistance $c$
given by (2.4) as $c = \gamma D^2$ where $\gamma = 0.25$ N·s$^2$/m$^4$. Therefore

$$v_{\rm ter} = \sqrt{\frac{mg}{\gamma D^2}} = \sqrt{\frac{(0.15\ \text{kg}) \times (9.8\ \text{m/s}^2)}{(0.25\ \text{N}\cdot\text{s}^2/\text{m}^4) \times (0.07\ \text{m})^2}} = 35\ \text{m/s} \tag{2.59}$$

or nearly 80 miles per hour. It is interesting to note that fast baseball pitchers
can pitch a ball considerably faster than $v_{\rm ter}$. Under these conditions, the drag
force is actually greater than the ball’s weight!

The plots of $v$ and $y$ can be made by hand, but are, of course, much easier with
the help of computer software such as Mathcad or Mathematica that can make the
plots for you. Whatever method we choose, the results are as shown in Figure 2.9,
where the solid curves show the actual velocity and position while the dashed
curves are the corresponding values in a vacuum. The actual velocity levels out,
approaching the terminal value $v_{\rm ter} = 35$ m/s as $t \to \infty$, whereas the velocity
in a vacuum would increase without limit. Initially, the position increases just
as it would in a vacuum (that is, $y = \tfrac{1}{2}gt^2$), but falls behind as $v$ increases and
the air resistance becomes more important. Eventually, $y$ approaches a straight
line of the form $y = v_{\rm ter}t + \text{const}$. (See Problem 2.35.)

Figure 2.9 The motion of a baseball dropped from the top of a high tower (solid
curves). The corresponding motion in a vacuum is shown with long dashes.
(a) The actual velocity approaches the ball’s terminal velocity $v_{\rm ter} = 35$ m/s as
$t \to \infty$. (b) The graph of position against time falls further and further behind
the corresponding vacuum graph. When $t = 6$ s, the baseball has dropped about
130 meters; in a vacuum, it would have dropped about 180 meters.
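
If you do not have a plotting package to hand, the numbers behind such plots can be tabulated with a short Python sketch. It uses the terminal speed $v_{\rm ter} = \sqrt{mg/(\gamma D^2)}$, the velocity $v = v_{\rm ter}\tanh(gt/v_{\rm ter})$, and the position obtained by integrating that velocity from rest, $y = (v_{\rm ter}^2/g)\ln\cosh(gt/v_{\rm ter})$. The function names are my own.

```python
import math

G = 9.8        # gravitational acceleration, m/s^2
GAMMA = 0.25   # quadratic-drag constant gamma from (2.4), N s^2/m^4

def terminal_speed_quadratic(m, D):
    """v_ter = sqrt(m g / c) with c = gamma * D^2."""
    return math.sqrt(m * G / (GAMMA * D**2))

def v_fall(t, v_ter):
    """Speed of a body dropped from rest with quadratic drag."""
    return v_ter * math.tanh(G * t / v_ter)

def y_fall(t, v_ter):
    """Distance fallen: the integral of v_fall from 0 to t."""
    return v_ter**2 / G * math.log(math.cosh(G * t / v_ter))

v_ter = terminal_speed_quadratic(0.15, 0.07)   # baseball: 0.15 kg, 7 cm
print(f"v_ter = {v_ter:.1f} m/s")
for t in range(7):
    print(f"t = {t} s: v = {v_fall(t, v_ter):5.1f} m/s, "
          f"y = {y_fall(t, v_ter):6.1f} m, vacuum y = {0.5 * G * t * t:6.1f} m")
```

The last row of the table should reproduce the drops of roughly 130 m and 180 m quoted for $t = 6$ s.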


Quadratic Drag with Horizontal and Vertical Motion 

The equation of motion for a projectile subject to quadratic drag,

$$m\ddot{\mathbf{r}} = m\mathbf{g} - cv^2\hat{\mathbf{v}} = m\mathbf{g} - cv\mathbf{v}, \tag{2.60}$$

resolves into its horizontal and vertical components (with $y$ measured vertically
upward) to give

$$\begin{aligned} m\dot{v}_x &= -c\sqrt{v_x^2 + v_y^2}\;v_x \\ m\dot{v}_y &= -mg - c\sqrt{v_x^2 + v_y^2}\;v_y. \end{aligned} \tag{2.61}$$

These are two differential equations for the two unknown functions $v_x(t)$ and $v_y(t)$,
but each equation involves both $v_x$ and $v_y$. In particular, neither equation is the same
as for an object that moves only in the $x$ direction or only in the $y$ direction. This
means that we cannot solve these two equations by simply pasting together our two
separate solutions for horizontal and vertical motion. Worse still, it turns out that
the two equations (2.61) cannot be solved analytically at all. The only way to solve
them is numerically, which we can only do for specified numerical initial conditions
(that is, specified values of the initial position and velocity). This means that we
cannot find the general solution; all we can do numerically is to find the particular
solution corresponding to any chosen initial conditions. Before I discuss some general
properties of the solutions of (2.61), let us work out one such numerical solution.

EXAMPLE 2.6 Trajectory of a Baseball

The baseball of Example 2.5 is now thrown with velocity 30 m/s (about 70 mi/h)
at 50° above the horizontal from a high cliff. Find its trajectory for the first eight
seconds of flight and compare with the corresponding trajectory in a vacuum. If
the same baseball was thrown with the same initial velocity on horizontal ground,
how far would it travel before landing? That is, what is its horizontal range?

We have to solve the two coupled differential equations (2.61) with the initial
conditions

$$v_{xo} = v_o\cos\theta = 19.3\ \text{m/s} \qquad \text{and} \qquad v_{yo} = v_o\sin\theta = 23.0\ \text{m/s}$$

and $x_o = y_o = 0$ (if we put the origin at the point from which the ball is thrown).
This can be done with systems such as Mathematica, Matlab, or Maple, or with
programming languages such as “C” or Fortran. Figure 2.10 shows the resulting
trajectory, found using the function “NDSolve” in Mathematica.

Figure 2.10 Trajectory of a baseball thrown off a cliff and subject
to quadratic air resistance (solid curve). The initial velocity is 30
m/s at 50° above the horizontal; the terminal speed is 35 m/s. The
dashed curve shows the corresponding trajectory in a vacuum.
The dots show the ball’s position at one-second intervals. Air
resistance slows the horizontal motion, so that the ball approaches
a vertical asymptote just beyond $x = 100$ meters.

Several features of Figure 2.10 deserve comment. Obviously the effect of
air resistance is to lower the trajectory, as compared to the vacuum trajectory
(shown dashed). For example, we see that in a vacuum the high point of the
trajectory occurs at $t \approx 2.3$ s and is about 27 m above the starting point; with air
resistance, the high point comes just before $t = 2.0$ s and is at about 21 m. In
a vacuum, the ball would continue to move indefinitely in the $x$ direction. The
effect of air resistance is to slow the horizontal motion so that $x$ never moves to
the right of a vertical asymptote near $x = 100$ m.

The horizontal range of the baseball is easily read off the figure as the value
of $x$ when $y$ returns to zero. We see that $R \approx 59$ m, as opposed to the range in
vacuum, $R_{\rm vac} \approx 90$ m. The effect of air resistance is quite large in this example,
as we might have anticipated: The ball was thrown with a speed only a little
less than the terminal speed (30 vs 35 m/s), and this means that the force of air
resistance is only a little less than that of gravity. This being the case, we should
expect air resistance to change the trajectory appreciably.
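
For readers without access to Mathematica, here is a minimal Python sketch of the same calculation. It is not the NDSolve computation used for the figure: it integrates the equations (2.61) with a hand-written fourth-order Runge-Kutta step, and the step size, function names, and linear interpolation at landing are my own choices.

```python
import math

G = 9.8
C_OVER_M = 0.25 * 0.07**2 / 0.15   # c/m with c = gamma * D^2 for the baseball

def deriv(state):
    """Time derivative of state = (x, y, vx, vy), from (2.61) with y upward."""
    _, _, vx, vy = state
    drag = C_OVER_M * math.hypot(vx, vy)
    return (vx, vy, -drag * vx, -G - drag * vy)

def shifted(state, k, h):
    """Return state + h * k, component by component."""
    return tuple(s + h * ki for s, ki in zip(state, k))

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def fly(speed=30.0, angle_deg=50.0, dt=1e-3):
    """Return (range on level ground, maximum height, time of maximum height)."""
    theta = math.radians(angle_deg)
    state = (0.0, 0.0, speed * math.cos(theta), speed * math.sin(theta))
    t, y_max, t_max = 0.0, 0.0, 0.0
    while True:
        new = rk4_step(state, dt)
        t += dt
        if new[1] > y_max:
            y_max, t_max = new[1], t
        if new[1] < 0.0:   # crossed y = 0: interpolate linearly for the range
            frac = state[1] / (state[1] - new[1])
            return state[0] + frac * (new[0] - state[0]), y_max, t_max
        state = new

R, y_max, t_max = fly()
print(f"R = {R:.1f} m, high point {y_max:.1f} m at t = {t_max:.2f} s")
```

The printed range and high point should be close to the values read off the figure; calling `fly` with other speeds or angles gives the corresponding particular solutions.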


This example illustrates several of the general features of projectile motion with a
quadratic drag force. Although we cannot solve analytically the equations of motion
(2.61) for this problem, we can use the equations to prove various general properties
of the trajectory. For example, we noticed that the baseball reached a lower maximum
height, and did so sooner, than it would have in a vacuum. It is easy to prove that this
will always be the case: As long as the projectile is moving upward ($v_y > 0$), the force
of air resistance has a downward $y$ component. Thus the downward acceleration is
greater than $g$ (its value in vacuum). Therefore a graph of $v_y$ against $t$ slopes down from
$v_{yo}$ more quickly than it would in vacuum, as shown in Figure 2.11. This guarantees
that $v_y$ reaches zero sooner than it would in vacuum, and that the ball travels less
distance (in the $y$ direction) before reaching the high point. That is, the ball’s high
point occurs sooner, and is lower, than it would be in a vacuum.



Figure 2.11 Graph of $v_y$ against $t$ for a projectile that is thrown
upward ($v_{yo} > 0$) and is subject to a quadratic resistance (solid
curve). The dashed line (slope $= -g$) is the corresponding graph
when there is no air resistance. The projectile moves upward
until it reaches its maximum height when $v_y = 0$. During this
time, the drag force is downward and the downward acceleration
is always greater than $g$. Therefore, the curve slopes more steeply
than the dashed line, and the projectile reaches its high point
sooner than it would in a vacuum. Since the area under the curve
is less than that under the dashed line, the projectile’s maximum
height is less than it would be in a vacuum.

I claimed that the baseball of Example 2.6 approaches a vertical asymptote as
$t \to \infty$, and we can now prove that this is always the case. First, it is easy to
convince yourself that once the ball starts moving downward, it continues to accelerate
downward, with $v_y$ approaching $-v_{\rm ter}$ as $t \to \infty$. At the same time $v_x$ continues to
decrease and approaches zero. Thus the square root in both of the equations (2.61)
approaches $v_{\rm ter}$. In particular, when $t$ is large, the equation for $v_x$ can be approximated
by

$$\dot{v}_x \approx -\frac{cv_{\rm ter}}{m}v_x = -kv_x$$

say. The solution of this equation is, of course, an exponential function, $v_x = Ae^{-kt}$,
and we see that $v_x$ approaches zero very rapidly (exponentially) as $t \to \infty$. This
guarantees that $x$, which is the integral of $v_x$,

$$x(t) = \int_0^t v_x(t')\,dt',$$

approaches a finite limit as $t \to \infty$, and the trajectory has a finite vertical asymptote
as claimed.
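
A numerical integration makes this leveling-off visible. The Python sketch below follows the baseball of Example 2.6 for 40 seconds (as if the cliff were arbitrarily high) and prints $x$ at a few times, together with the late-time decay rate $k = cv_{\rm ter}/m$. The Runge-Kutta integrator, step size, and function names are my own choices, not taken from the text.

```python
import math

G = 9.8
K = 0.25 * 0.07**2 / 0.15   # c/m for the baseball of Example 2.5

def deriv(s):
    """Time derivative of s = (x, vx, vy), from (2.61) with y upward."""
    _, vx, vy = s
    drag = K * math.hypot(vx, vy)
    return (vx, -drag * vx, -G - drag * vy)

def shifted(s, k, h):
    """Return s + h * k, component by component."""
    return tuple(si + h * ki for si, ki in zip(s, k))

def rk4(s, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = deriv(s)
    k2 = deriv(shifted(s, k1, dt / 2))
    k3 = deriv(shifted(s, k2, dt / 2))
    k4 = deriv(shifted(s, k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def x_at(times, speed=30.0, angle_deg=50.0, dt=1e-3):
    """Horizontal position at each requested time (times in ascending order)."""
    th = math.radians(angle_deg)
    s, t, out = (0.0, speed * math.cos(th), speed * math.sin(th)), 0.0, []
    for target in times:
        while t < target - dt / 2:
            s, t = rk4(s, dt), t + dt
        out.append(s[0])
    return out

v_ter = math.sqrt(G / K)
print(f"late-time decay rate k = c v_ter / m = {K * v_ter:.3f} per second")
x10, x30, x40 = x_at([10.0, 30.0, 40.0])
print(f"x(10 s) = {x10:.1f} m, x(30 s) = {x30:.2f} m, x(40 s) = {x40:.2f} m")
```

The last two values should be nearly equal, which is the numerical signature of the vertical asymptote.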


2.5 Motion of a Charge in a Uniform Magnetic Field

Another interesting application of Newton’s laws, and (like projectile motion) an
application that lets me introduce some important mathematical methods, is the
motion of a charged particle in a magnetic field. I shall consider here a particle of
charge $q$ (which I shall usually take to be positive), moving in a uniform magnetic
field $\mathbf{B}$ that points in the $z$ direction as shown in Figure 2.12. The net force on the
particle is just the magnetic force

$$\mathbf{F} = q\mathbf{v} \times \mathbf{B}, \tag{2.62}$$

so the equation of motion can be written as

$$m\dot{\mathbf{v}} = q\mathbf{v} \times \mathbf{B}. \tag{2.63}$$

[As with projectiles, the force depends only on the velocity (not the position), so the
second law reduces to a first-order differential equation for $\mathbf{v}$.]

Figure 2.12 A charged particle moving in a uniform magnetic
field that points in the $z$ direction.

As is so often the case, the simplest way to solve the equation of motion is to
resolve it into components. The components of $\mathbf{v}$ and $\mathbf{B}$ are

$$\mathbf{v} = (v_x, v_y, v_z)$$

and

$$\mathbf{B} = (0, 0, B),$$

from which we can read off the components of $\mathbf{v} \times \mathbf{B}$:

$$\mathbf{v} \times \mathbf{B} = (v_yB, -v_xB, 0).$$

Thus the three components of (2.63) are

$$m\dot{v}_x = qBv_y \tag{2.64}$$

$$m\dot{v}_y = -qBv_x \tag{2.65}$$

$$m\dot{v}_z = 0. \tag{2.66}$$

The last of these says simply that $v_z$, the component of the particle’s velocity in the
direction of $\mathbf{B}$, is constant:

$$v_z = \text{const},$$

a result we could have anticipated since the magnetic force is always perpendicular to
$\mathbf{B}$. Because $v_z$ is constant, we shall focus most of our attention on $v_x$ and $v_y$. In fact,
we can even think of them as comprising a two-dimensional vector $(v_x, v_y)$, which is
just the projection of $\mathbf{v}$ onto the $xy$ plane and can be called the transverse velocity.

$$(v_x, v_y) = \text{transverse velocity}.$$

To simplify the equations (2.64) and (2.65) for $v_x$ and $v_y$, I shall define the
parameter

$$\omega = \frac{qB}{m}, \tag{2.67}$$

which has the dimensions of inverse time and is called the cyclotron frequency. With
this notation, Equations (2.64) and (2.65) become

$$\left.\begin{aligned} \dot{v}_x &= \omega v_y \\ \dot{v}_y &= -\omega v_x. \end{aligned}\right\} \tag{2.68}$$


These two coupled differential equations can be solved in a host of different ways.
I would like to describe one that makes use of complex numbers. Though perhaps
not the easiest solution, this method has surprisingly wide application in many areas
of physics. (For an alternative solution that avoids complex numbers, see Problem 2.54.)

Figure 2.13 The complex number $\eta = v_x + iv_y$ is represented as a point in the complex plane. The arrow pointing from $O$ to $\eta$ is literally a picture of the transverse velocity vector $(v_x, v_y)$.

The two variables $v_x$ and $v_y$ are, of course, real numbers. However, there is nothing
to prevent us from defining a complex number

$$\eta = v_x + iv_y, \tag{2.69}$$

where $i$ denotes the square root of $-1$ (called $j$ by most engineers), $i = \sqrt{-1}$ (and
$\eta$ is the Greek letter eta). If we draw the complex number $\eta$ in the complex plane,
or Argand diagram, then its two components are $v_x$ and $v_y$ as shown in Figure 2.13;
in other words, the representation of $\eta$ in the complex plane is a picture of the
two-dimensional transverse velocity $(v_x, v_y)$.

The advantages of introducing the complex number $\eta$ appear when we evaluate its
derivative. Using (2.68), we find that

$$\dot{\eta} = \dot{v}_x + i\dot{v}_y = \omega v_y - i\omega v_x = -i\omega(v_x + iv_y)$$

or

$$\dot{\eta} = -i\omega\eta. \tag{2.70}$$

We see that the two coupled equations for $v_x$ and $v_y$ have become a single equation for
the complex number $\eta$. Furthermore, it is an equation of the now familiar form $\dot{u} = ku$,
whose solution we know to be the exponential $u = Ae^{kt}$. Thus we can immediately
write down the solution for $\eta$:

$$\eta = Ae^{-i\omega t}. \tag{2.71}$$
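As a numerical cross-check of (2.71), the following minimal Python sketch integrates the coupled pair (2.68) directly and compares the result with the complex exponential. The function names, the fourth-order Runge-Kutta scheme, the step count, and the sample values are all illustrative assumptions, not part of the text.

```python
import cmath

def eta_exact(v_x0, v_y0, omega, t):
    """Closed form eta(t) = A exp(-i omega t), with A = eta(0) = v_x0 + i v_y0."""
    A = complex(v_x0, v_y0)
    return A * cmath.exp(-1j * omega * t)

def integrate_pair(v_x0, v_y0, omega, t, steps=1000):
    """Integrate dv_x/dt = omega v_y, dv_y/dt = -omega v_x with classical RK4."""
    def deriv(vx, vy):
        return omega * vy, -omega * vx

    vx, vy = v_x0, v_y0
    h = t / steps
    for _ in range(steps):
        k1x, k1y = deriv(vx, vy)
        k2x, k2y = deriv(vx + 0.5 * h * k1x, vy + 0.5 * h * k1y)
        k3x, k3y = deriv(vx + 0.5 * h * k2x, vy + 0.5 * h * k2y)
        k4x, k4y = deriv(vx + h * k3x, vy + h * k3y)
        vx += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        vy += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return vx, vy

if __name__ == "__main__":
    # Arbitrary sample values in arbitrary units (assumed for illustration).
    vx, vy = integrate_pair(1.0, 0.5, omega=2.0, t=1.5)
    eta = eta_exact(1.0, 0.5, omega=2.0, t=1.5)
    print("numerical:", vx, vy)
    print("exact:    ", eta.real, eta.imag)
```

The two printed pairs should agree to many decimal places, and the magnitude of $\eta$ stays equal to its initial value, as expected for a velocity that only rotates.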

Before we discuss the significance of this solution, I would like to review a few
properties of complex exponentials in the next section. If you are very familiar with
these ideas, by all means skip this material.

2.6 Complex Exponentials


While you are certainly familiar with the exponential function $e^x$ for a real variable
$x$, you may not be so at home with $e^z$ when $z$ is complex.$^6$ For the real case there are
several possible definitions of $e^x$ (for instance, as the function that is equal to its own
derivative). The definition that extends most easily to the complex case is the Taylor
series (see Problem 2.18)

$$e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots. \tag{2.72}$$

For any value of $z$, real or complex, large or small, this series converges to give a
well-defined value for $e^z$. By differentiating it, you can easily convince yourself that it has
the expected property that it equals its own derivative. And one can show (not always
so easily) that it has all the other familiar properties of the exponential function; for
instance, that $e^z e^w = e^{(z+w)}$. (See Problems 2.50 and 2.51.) In particular, the function
$Ae^{kz}$ (with $A$ and $k$ any constants, real or complex) has the property that

$$\frac{d}{dz}\left(Ae^{kz}\right) = k\left(Ae^{kz}\right). \tag{2.73}$$

Since it satisfies this same equation whatever the value of $A$, it is, in fact, the
general solution of the first-order equation $df/dz = kf$. At the end of the last section,
I introduced the complex number $\eta(t)$ and showed that it satisfied the equation
$\dot{\eta} = -i\omega\eta$. We are now justified in saying that this guarantees that $\eta$ must be the
exponential function anticipated in (2.71).

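We shall be particularly concerned with the exponential of a pure imaginary
number, that is, $e^{i\theta}$ where $\theta$ is a real number. The Taylor series (2.72) for this function
reads

$$e^{i\theta} = 1 + i\theta + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \cdots. \tag{2.74}$$

Noting that $i^2 = -1$, $i^3 = -i$, and so on, you can see that all of the even powers
in this series are real, while all of the odd powers are pure imaginary. Regrouping
accordingly, we can rewrite (2.74) to read

$$e^{i\theta} = \left[1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right] + i\left[\theta - \frac{\theta^3}{3!} + \cdots\right]. \tag{2.75}$$

The series in the first brackets is the Taylor series for $\cos\theta$, and that in the second
brackets is $\sin\theta$ (Problem 2.18). Thus we have proved the important relation:

$$e^{i\theta} = \cos\theta + i\sin\theta. \tag{2.76}$$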
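The relation (2.76) can be checked numerically by summing the series (2.72) for a pure imaginary argument. The sketch below is only an illustration under assumed choices: the function name `exp_series`, the fixed number of terms, and the sample angle are not from the text.

```python
import math

def exp_series(z, terms=30):
    """Partial sum 1 + z + z^2/2! + ... of the Taylor series (2.72)."""
    total = 0j
    term = 1 + 0j               # the n = 0 term, z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)     # turn z^n/n! into z^(n+1)/(n+1)!
    return total

if __name__ == "__main__":
    theta = 1.2                 # sample angle in radians (assumed)
    series_value = exp_series(1j * theta)
    euler_value = complex(math.cos(theta), math.sin(theta))
    print(series_value, euler_value)
    print("magnitude:", abs(series_value))
```

For modest angles, thirty terms are far more than needed; the partial sum agrees with $\cos\theta + i\sin\theta$ to within rounding error and has magnitude 1.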


$^6$ For a review of some elementary properties of complex numbers, see Problems 2.45 to 2.49.

Figure 2.14 (a) Euler’s formula, (2.76), implies that the complex number $e^{i\theta}$ lies on the unit circle (the circle of radius 1, centered on the origin $O$) with polar angle $\theta$. (b) The complex constant $A = ae^{i\delta}$ lies on a circle of radius $a$ with polar angle $\delta$. The function $\eta(t) = Ae^{-i\omega t}$ lies on the same circle but with polar angle $(\delta - \omega t)$ and moves clockwise around the circle as $t$ advances.


This result, known as Euler’s formula, is illustrated in Figure 2.14(a). Note especially
that the complex number $e^{i\theta}$ has polar angle $\theta$ and, since $\cos^2\theta + \sin^2\theta = 1$, the
magnitude of $e^{i\theta}$ is 1; that is, $e^{i\theta}$ lies on the unit circle, the circle with radius 1 centered
at $O$.

Our main concern is with a complex number of the form $\eta = Ae^{-i\omega t}$. The coefficient
$A$ is a fixed complex number, which can be expressed as $A = ae^{i\delta}$, where
$a = |A|$ is the magnitude, and $\delta$ is the polar angle of $A$, as shown in Figure 2.14(b).
(See Problem 2.45.) The number $\eta$ can therefore be written as

$$\eta = Ae^{-i\omega t} = ae^{i\delta}e^{-i\omega t} = ae^{i(\delta - \omega t)}. \tag{2.77}$$

Thus $\eta$ has the same magnitude as $A$ (namely $a$), but has polar angle equal to $(\delta - \omega t)$,
as shown in Figure 2.14(b). As a function of $t$, the number $\eta$ moves clockwise around
the circle of radius $a$ with angular velocity $\omega$.

It is important that you get a good feel for the role of the complex constant $A = ae^{i\delta}$
in (2.77): If $A$ happened to equal 1, then $\eta$ would be just $\eta = e^{-i\omega t}$, which lies on the
unit circle, moving clockwise with angular velocity $\omega$ and starting from the real axis
($\eta = 1$) when $t = 0$. If $A = a$ is real but not equal to 1, then it simply magnifies the
unit circle to a circle of radius $a$, around which $\eta$ moves with the same angular speed
and starting from the real axis, at $\eta = a$ when $t = 0$. Finally, if $A = ae^{i\delta}$, then the
effect of the angle $\delta$ is to rotate $\eta$ through the fixed angle $\delta$, so that $\eta$ starts out at $t = 0$
with polar angle $\delta$.

Armed with these mathematical results, we can now return to the charged particle
in a magnetic field.

2.7 Solution for the Charge in a B Field


Mathematically, the solution for the velocity $\mathbf{v}$ of our charged particle in a $\mathbf{B}$ field is
complete, and all that remains is to interpret it physically. We already know that $v_z$,
the component along $\mathbf{B}$, is constant. The components $(v_x, v_y)$ transverse to $\mathbf{B}$ we have
represented by the complex number $\eta = v_x + iv_y$, and we have seen that Newton’s
second law implies that $\eta$ has the time dependence $\eta = Ae^{-i\omega t}$, moving uniformly
around the circle of Figure 2.14(b). Now, the arrow shown in that figure, pointing
from $O$ to $\eta$, is in fact a pictorial representation of the transverse velocity $(v_x, v_y)$.
Therefore this transverse velocity changes direction, turning clockwise, with constant
angular velocity$^7$ $\omega = qB/m$ and with constant magnitude. Because $v_z$ is constant,
this suggests that the particle undergoes a spiralling, or helical, motion. To verify this,
we have only to integrate $\mathbf{v}$ to find $\mathbf{r}$ as a function of $t$.

That $v_z$ is constant implies that

$$z(t) = z_0 + v_{z0}t. \tag{2.78}$$

The motion of $x$ and $y$ is most easily found by introducing another complex number

$$\xi = x + iy,$$

where $\xi$ is the Greek letter xi. In the complex plane, $\xi$ is a picture of the transverse
position $(x, y)$. Clearly, the derivative of $\xi$ is $\eta$, that is, $\dot{\xi} = \eta$. Therefore,

$$\xi = \int \eta\,dt = \int Ae^{-i\omega t}\,dt = \frac{iA}{\omega}e^{-i\omega t} + \text{constant}. \tag{2.79}$$

If we rename the coefficient $iA/\omega$ as $C$ and the constant of integration as $X + iY$,
this implies that

$$x + iy = Ce^{-i\omega t} + (X + iY).$$

By redefining our origin so that the $z$ axis goes through the point $(X, Y)$, we can
eliminate the constant term on the right to give

$$x + iy = Ce^{-i\omega t}, \tag{2.80}$$

and, by setting $t = 0$, we can identify the remaining constant $C$ as

$$C = x_0 + iy_0.$$

This result is illustrated in Figure 2.15. We see there that the transverse position $(x, y)$
moves clockwise round a circle with angular velocity $\omega = qB/m$. Meanwhile $z$ as
given by (2.78) increases steadily, so the particle actually describes a uniform helix
whose axis is parallel to the magnetic field.
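The integration leading to (2.78) through (2.80) can be turned into a few lines of Python. This is a sketch under stated assumptions: the helper names `transverse_orbit` and `position`, the tuple layout of the initial conditions, and the sample numbers are hypothetical, and the origin is deliberately not shifted, so the circle's center $X + iY$ is kept explicitly.

```python
import cmath

def transverse_orbit(v0, r0, omega):
    """Return (C, center) such that xi(t) = C exp(-i omega t) + center, as in (2.79)."""
    A = complex(v0[0], v0[1])            # A = eta(0) = v_x0 + i v_y0
    C = 1j * A / omega                   # the coefficient iA/omega
    center = complex(r0[0], r0[1]) - C   # X + iY, fixed by xi(0) = x_0 + i y_0
    return C, center

def position(v0, r0, omega, t):
    """Position (x, y, z) at time t for initial velocity v0 and initial position r0."""
    C, center = transverse_orbit(v0, r0, omega)
    xi = C * cmath.exp(-1j * omega * t) + center
    z = r0[2] + v0[2] * t                # Eq. (2.78)
    return xi.real, xi.imag, z

if __name__ == "__main__":
    v0 = (3.0, 4.0, 1.0)                 # sample initial velocity (assumed units)
    r0 = (1.0, 2.0, 0.0)                 # sample initial position
    omega = 2.0
    C, center = transverse_orbit(v0, r0, omega)
    print("orbit radius |C| =", abs(C))  # equals transverse speed / omega
    for t in (0.0, 0.5, 1.0):
        print(t, position(v0, r0, omega, t))
```

The distance of $(x, y)$ from the center stays equal to $|C|$, the transverse speed divided by $\omega$, while $z$ grows linearly: a helix.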


$^7$ I am assuming the charge $q$ is positive; if $q$ is negative, then $\omega = qB/m$ is negative, meaning
that the transverse velocity rotates counterclockwise.

Figure 2.15 Motion of a charge in a uniform magnetic field in the $z$ direction. The transverse position $(x, y)$ moves around a circle as shown, while the coordinate $z$ moves with constant velocity into or out of the page.


There are many examples of the helical motion of a charged particle along a
magnetic field; for example, cosmic-ray particles (charged particles hitting the earth
from space) can get caught by the earth’s magnetic field and spiral north or south
along the field lines. If the $z$ component of the velocity happens to be zero, then
the spiral reduces to a circle. In the cyclotron, a device for accelerating charged
particles to high energies, the particles are trapped in circular orbits in this way. They
are slowly accelerated by the judiciously timed application of an electric field. The
angular frequency of the orbit is, of course, $\omega = qB/m$ (which is why this is called
the cyclotron frequency). The radius of the orbit is

$$r = \frac{v}{\omega} = \frac{mv}{qB} = \frac{p}{qB}. \tag{2.81}$$

This radius increases as the particles accelerate, so that they eventually emerge at the
outer edge of the circular magnets that produce the magnetic field.
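To get a feel for the sizes involved, here is a small Python sketch that evaluates the cyclotron frequency $\omega = qB/m$ and the orbit radius $r = mv/(qB)$. The choice of a proton, the field of 1 T, and the speed of $10^6$ m/s are illustrative assumptions (the speed is low enough for nonrelativistic mechanics to apply); the function and constant names are hypothetical.

```python
ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs (exact SI value)
PROTON_MASS = 1.67262192e-27          # kilograms (CODATA value, rounded)

def cyclotron_frequency(q, B, m):
    """Angular frequency omega = qB/m, in rad/s for SI inputs."""
    return q * B / m

def orbit_radius(m, v, q, B):
    """Radius r = mv/(qB) of the circular orbit for transverse speed v."""
    return m * v / (q * B)

if __name__ == "__main__":
    B = 1.0       # tesla (assumed sample field)
    v = 1.0e6     # m/s (assumed sample speed, well below the speed of light)
    omega = cyclotron_frequency(ELEMENTARY_CHARGE, B, PROTON_MASS)
    r = orbit_radius(PROTON_MASS, v, ELEMENTARY_CHARGE, B)
    print(f"omega = {omega:.4e} rad/s")
    print(f"r     = {r:.4e} m")
```

With these sample values the frequency is close to $10^8$ rad/s and the radius is about a centimeter; doubling the speed doubles the radius while leaving $\omega$ unchanged.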

The same method that we have used here for a charge in a magnetic field can 
also be used for a particle in magnetic and electric fields, but since this complication 
adds nothing to the method of solution, I shall leave you to try it for yourself in 
Problems 2.53 and 2.55. 


Principal Definitions and Equations of Chapter 2

Linear and Quadratic Drags

Provided the speed $v$ is well below that of sound, the magnitude of the drag force
$\mathbf{f} = -f(v)\hat{\mathbf{v}}$ on an object moving through a fluid is usually well approximated as

$$f(v) = f_{\text{lin}} + f_{\text{quad}}$$

where

$f_{\text{lin}} = bv = \beta Dv$ and $f_{\text{quad}} = cv^2 = \gamma D^2v^2$. [Eqs. (2.2) to (2.6)]

Here $D$ denotes the linear size of the object. For a sphere, $D$ is the diameter and, for
a sphere in air at STP, $\beta = 1.6 \times 10^{-4}$ N·s/m$^2$ and $\gamma = 0.25$ N·s$^2$/m$^4$.
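The two drag terms are easy to compare numerically. The following Python sketch uses the values of $\beta$ and $\gamma$ quoted above for a sphere in air at STP; the function names and the sample diameter and speed are illustrative assumptions.

```python
BETA = 1.6e-4    # N·s/m^2, linear-drag coefficient for a sphere in air at STP
GAMMA = 0.25     # N·s^2/m^4, quadratic-drag coefficient for a sphere in air at STP

def f_lin(D, v):
    """Linear drag magnitude, beta * D * v, in newtons."""
    return BETA * D * v

def f_quad(D, v):
    """Quadratic drag magnitude, gamma * D^2 * v^2, in newtons."""
    return GAMMA * D**2 * v**2

def drag_ratio(D, v):
    """Ratio f_quad / f_lin = gamma * D * v / beta."""
    return f_quad(D, v) / f_lin(D, v)

if __name__ == "__main__":
    D = 0.07     # meters, roughly a baseball (sample value)
    v = 5.0      # m/s (sample speed)
    print("f_lin  =", f_lin(D, v), "N")
    print("f_quad =", f_quad(D, v), "N")
    print("ratio  =", drag_ratio(D, v))
```

For this sample the quadratic term exceeds the linear one by a factor of several hundred, which illustrates why the linear term is often negligible for everyday projectiles.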

The Lorentz Force on a Charged Particle 


$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$. [Eq. (2.62) & Problem 2.53]


Problems for Chapter 2

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult (***).


section 2.1 Air Resistance 

2.1 * When a baseball flies through the air, the ratio $f_{\text{quad}}/f_{\text{lin}}$ of the quadratic to the linear drag force
is given by (2.7). Given that a baseball has diameter 7 cm, find the approximate speed $v$ at which the
two drag forces are equally important. For what approximate range of speeds is it safe to treat the drag
force as purely quadratic? Under normal conditions is it a good approximation to ignore the linear
term? Answer the same questions for a beach ball of diameter 70 cm.

2.2 * The origin of the linear drag force on a sphere in a fluid is the viscosity of the fluid. According
to Stokes’s law, the viscous drag on a sphere is

$$f_{\text{lin}} = 3\pi\eta Dv \tag{2.82}$$

where $\eta$ is the viscosity$^8$ of the fluid, $D$ the sphere’s diameter, and $v$ its speed. Show that this expression
reproduces the form (2.3) for $f_{\text{lin}}$, with $b$ given by (2.4) as $b = \beta D$. Given that the viscosity of air at
STP is $\eta = 1.7 \times 10^{-5}$ N·s/m$^2$, verify the value of $\beta$ given in (2.5).

2.3 * (a) The quadratic and linear drag forces on a moving sphere in a fluid are given by (2.84) and
(2.82) (Problems 2.4 and 2.2). Show that the ratio of these two kinds of drag force can be written as
$f_{\text{quad}}/f_{\text{lin}} = R/48$,$^9$ where the dimensionless Reynolds number $R$ is

$$R = \frac{Dv\varrho}{\eta} \tag{2.83}$$

where $D$ is the sphere’s diameter, $v$ its speed, and $\varrho$ and $\eta$ are the fluid’s density and viscosity. Clearly
the Reynolds number is a measure of the relative importance of the two kinds of drag.$^{10}$ When $R$ is
very large, the quadratic drag is dominant and the linear can be neglected; vice versa when $R$ is very
small. (b) Find the Reynolds number for a steel ball bearing (diameter 2 mm) moving at 5 cm/s through
glycerin (density 1.3 g/cm$^3$ and viscosity 12 N·s/m$^2$ at STP).

$^8$ For the record, the viscosity $\eta$ of a fluid is defined as follows: Imagine a wide channel along which fluid is
flowing ($x$ direction) such that the velocity $v$ is zero at the bottom ($y = 0$) and increases toward the top ($y = h$), so
that successive layers of fluid slide across one another with a velocity gradient $dv/dy$. The force $F$ with which an
area $A$ of any one layer drags the fluid above it is proportional to $A$ and to $dv/dy$, and $\eta$ is defined as the constant
of proportionality; that is, $F = \eta A\,dv/dy$.

$^9$ The numerical factor 48 is for a sphere. A similar result holds for other bodies, but the numerical factor is
different for different shapes.

$^{10}$ The Reynolds number is usually defined by (2.83) for flow involving any object, with $D$ defined as a typical
linear dimension. One sometimes hears the claim that $R$ is the ratio $f_{\text{quad}}/f_{\text{lin}}$. Since $f_{\text{quad}}/f_{\text{lin}} = R/48$ for a sphere,
this claim would be better phrased as “$R$ is roughly of the order of $f_{\text{quad}}/f_{\text{lin}}$.”

2.4 ** The origin of the quadratic drag force on any projectile in a fluid is the inertia of the fluid that the
projectile sweeps up. (a) Assuming the projectile has a cross-sectional area $A$ (normal to its velocity)
and speed $v$, and that the density of the fluid is $\varrho$, show that the rate at which the projectile encounters
fluid (mass/time) is $\varrho Av$. (b) Making the simplifying assumption that all of this fluid is accelerated
to the speed $v$ of the projectile, show that the net drag force on the projectile is $\varrho Av^2$. It is certainly
not true that all the fluid that the projectile encounters is accelerated to the full speed $v$, but one might
guess that the actual force would have the form

$$f_{\text{quad}} = \kappa\varrho Av^2 \tag{2.84}$$

where $\kappa$ is a number less than 1, which would depend on the shape of the projectile, with $\kappa$ small for
a streamlined body, and larger for a body with a flat front end. This proves to be true, and for a sphere
the factor $\kappa$ is found to be $\kappa = 1/4$. (c) Show that (2.84) reproduces the form (2.3) for $f_{\text{quad}}$, with $c$
given by (2.4) as $c = \gamma D^2$. Given that the density of air at STP is $\varrho = 1.29$ kg/m$^3$ and that $\kappa = 1/4$ for
a sphere, verify the value of $\gamma$ given in (2.6).


section 2.2 Linear Air Resistance 

2.5 * Suppose that a projectile which is subject to a linear resistive force is thrown vertically down
with a speed $v_{y0}$ which is greater than the terminal speed $v_{\text{ter}}$. Describe and explain how the velocity
varies with time, and make a plot of $v_y$ against $t$ for the case that $v_{y0} = 2v_{\text{ter}}$.

2.6 * (a) Equation (2.33) gives the velocity of an object dropped from rest. At first, when $v_y$ is small,
air resistance should be unimportant and (2.33) should agree with the elementary result $v_y = gt$
for free fall in a vacuum. Prove that this is the case. [Hint: Remember the Taylor series for $e^x =
1 + x + x^2/2! + x^3/3! + \cdots$, for which the first two or three terms are certainly a good approximation
when $x$ is small.] (b) The position of the dropped object is given by (2.35) with $v_{y0} = 0$. Show similarly
that this reduces to the familiar $y = \tfrac{1}{2}gt^2$ when $t$ is small.

2.7 * There are certain simple one-dimensional problems where the equation of motion (Newton’s
second law) can always be solved, or at least reduced to the problem of doing an integral. One of these
(which we have met a couple of times in this chapter) is the motion of a one-dimensional particle subject
to a force that depends only on the velocity $v$, that is, $F = F(v)$. Write down Newton’s second law and
separate the variables by rewriting it as $m\,dv/F(v) = dt$. Now integrate both sides of this equation and
show that

$$t = m\int_{v_0}^{v} \frac{dv'}{F(v')}.$$

Provided you can do the integral, this gives $t$ as a function of $v$. You can then solve to give $v$ as a
function of $t$. Use this method to solve the special case that $F(v) = F_0$, a constant, and comment on
your result. This method of separation of variables is used again in Problems 2.8 and 2.9.

2.8 * A mass $m$ has velocity $v_0$ at time $t = 0$ and coasts along the $x$ axis in a medium where the drag
force is $F(v) = -cv^{3/2}$. Use the method of Problem 2.7 to find $v$ in terms of the time $t$ and the other
given parameters. At what time (if any) will it come to rest?

2.9 * We solved the differential equation (2.29), $m\dot{v}_y = -b(v_y - v_{\text{ter}})$, for the velocity of an object
falling through air, by inspection (a most respectable way of solving differential equations). Nevertheless,
one would sometimes like a more systematic method, and here is one. Rewrite the equation in
the “separated” form

$$\frac{m\,dv_y}{v_y - v_{\text{ter}}} = -b\,dt$$

and integrate both sides from time 0 to $t$ to find $v_y$ as a function of $t$. Compare with (2.30).

2.10 ** For a steel ball bearing (diameter 2 mm and density 7.8 g/cm$^3$) dropped in glycerin (density
1.3 g/cm$^3$ and viscosity 12 N·s/m$^2$ at STP), the dominant drag force is the linear drag given by (2.82)
of Problem 2.2. (a) Find the characteristic time $\tau$ and the terminal speed $v_{\text{ter}}$. [In finding the latter,
you should include the buoyant force of Archimedes. This just adds a third force on the right side of
Equation (2.25).] How long after it is dropped from rest will the ball bearing have reached 95% of its
terminal speed? (b) Use (2.82) and (2.84) (with $\kappa = 1/4$ since the ball bearing is a sphere) to compute
the ratio $f_{\text{quad}}/f_{\text{lin}}$ at the terminal speed. Was it a good approximation to neglect $f_{\text{quad}}$?

2.11 ** Consider an object that is thrown vertically up with initial speed $v_0$ in a linear medium.
(a) Measuring $y$ upward from the point of release, write expressions for the object’s velocity $v_y(t)$
and position $y(t)$. (b) Find the time for the object to reach its highest point and its position $y_{\max}$ at that
point. (c) Show that as the drag coefficient approaches zero, your last answer reduces to the well-known
result $y_{\max} = \tfrac{1}{2}v_0^2/g$ for an object in the vacuum. [Hint: If the drag force is very small, the terminal
speed is very big, so $v_0/v_{\text{ter}}$ is very small. Use the Taylor series for the log function to approximate
$\ln(1 + \delta)$ by $\delta - \tfrac{1}{2}\delta^2$. (For a little more on Taylor series see Problem 2.18.)]

2.12 ** Problem 2.7 is about a class of one-dimensional problems that can always be reduced to doing
an integral. Here is another. Show that if the net force on a one-dimensional particle depends only on
position, $F = F(x)$, then Newton’s second law can be solved to find $v$ as a function of $x$ given by

$$v^2 = v_0^2 + \frac{2}{m}\int_{x_0}^{x} F(x')\,dx'. \tag{2.85}$$

[Hint: Use the chain rule to prove the following handy relation, which we could call the “$v\,dv/dx$ rule”:
If you regard $v$ as a function of $x$, then

$$\dot{v} = v\frac{dv}{dx} = \frac{1}{2}\frac{dv^2}{dx}. \tag{2.86}$$

Use this to rewrite Newton’s second law in the separated form $m\,d(v^2) = 2F(x)\,dx$ and then
integrate from $x_0$ to $x$.] Comment on your result for the case that $F(x)$ is actually a constant. (You
may recognize your solution as a statement about kinetic energy and work, both of which we shall be
discussing in Chapter 4.)

2.13 ** Consider a mass $m$ constrained to move on the $x$ axis and subject to a net force $F = -kx$ where
$k$ is a positive constant. The mass is released from rest at $x = x_0$ at time $t = 0$. Use the result (2.85)
in Problem 2.12 to find the mass’s speed as a function of $x$; that is, $dx/dt = g(x)$ for some function
$g(x)$. Separate this as $dx/g(x) = dt$ and integrate from time 0 to $t$ to find $x$ as a function of $t$. (You
may recognize this as one way, though not the easiest, to solve the simple harmonic oscillator.)

2.14 *** Use the method of Problem 2.7 to solve the following: A mass $m$ is constrained to move along
the $x$ axis subject to a force $F(v) = -F_0e^{v/V}$, where $F_0$ and $V$ are constants. (a) Find $v(t)$ if the initial
velocity is $v_0 > 0$ at time $t = 0$. (b) At what time does it come instantaneously to rest? (c) By integrating
$v(t)$, you can find $x(t)$. Do this and find how far the mass travels before coming instantaneously to rest.

section 2.3 Trajectory and Range in a Linear Medium 

2.15 * Consider a projectile launched with velocity $(v_{x0}, v_{y0})$ from horizontal ground (with $x$ measured
horizontally and $y$ vertically up). Assuming no air resistance, find how long the projectile is in the air
and show that the distance it travels before landing (the horizontal range) is $2v_{x0}v_{y0}/g$.

2.16 * A golfer hits his ball with speed $v_0$ at an angle $\theta$ above the horizontal ground. Assuming that
the angle $\theta$ is fixed and that air resistance can be neglected, what is the minimum speed $v_0(\text{min})$ for
which the ball will clear a wall of height $h$, a distance $d$ away? Your solution should get into trouble if
the angle $\theta$ is such that $\tan\theta < h/d$. Explain. What is $v_0(\text{min})$ if $\theta = 25°$, $d = 50$ m, and $h = 2$ m?

2.17 * The two equations (2.36) give a projectile’s position $(x, y)$ as a function of $t$. Eliminate $t$ to
give $y$ as a function of $x$. Verify Equation (2.37).

2.18 * Taylor’s theorem states that, for any reasonable function $f(x)$, the value of $f$ at a point $(x + \delta)$
can be expressed as an infinite series involving $f$ and its derivatives at the point $x$:

$$f(x + \delta) = f(x) + f'(x)\delta + \frac{1}{2!}f''(x)\delta^2 + \frac{1}{3!}f'''(x)\delta^3 + \cdots \tag{2.87}$$

where the primes denote successive derivatives of $f(x)$. (Depending on the function this series may
converge for any increment $\delta$ or only for values of $\delta$ less than some nonzero “radius of convergence.”)
This theorem is enormously useful, especially for small values of $\delta$, when the first one or two terms of
the series are often an excellent approximation.$^{11}$ (a) Find the Taylor series for $\ln(1 + \delta)$. (b) Do the
same for $\cos\delta$. (c) Likewise $\sin\delta$. (d) And $e^{\delta}$.

2.19 * Consider the projectile of Section 2.3. (a) Assuming there is no air resistance, write down the
position $(x, y)$ as a function of $t$, and eliminate $t$ to give the trajectory $y$ as a function of $x$. (b) The
correct trajectory, including a linear drag force, is given by (2.37). Show that this reduces to your
answer for part (a) when air resistance is switched off ($\tau$ and $v_{\text{ter}} = g\tau$ both approach infinity). [Hint:
Remember the Taylor series (2.40) for $\ln(1 - \epsilon)$.]

2.20 ** [Computer] Use suitable graph-plotting software to plot graphs of the trajectory (2.36) of a
projectile thrown at 45° above the horizontal and subject to linear air resistance for four different values
of the drag coefficient, ranging from a significant amount of drag down to no drag at all. Put all four
trajectories on the same plot. [Hint: In the absence of any given numbers, you may as well choose
convenient values. For example, why not take $v_{x0} = v_{y0} = 1$ and $g = 1$. (This amounts to choosing your
units of length and time so that these parameters have the value 1.) With these choices, the strength
of the drag is given by the one parameter $v_{\text{ter}} = \tau$, and you might choose to plot the trajectories for
$v_{\text{ter}} = 0.3$, 1, 3, and $\infty$ (that is, no drag at all), and for times from $t = 0$ to 3. For the case that $v_{\text{ter}} = \infty$,
you’ll probably want to write out the trajectory separately.]

2.21 A gun can fire shells in any direction with the same speed $v_0$. Ignoring air resistance and
using cylindrical polar coordinates with the gun at the origin and $z$ measured vertically up, show that
the gun can hit any object inside the surface

$$z = \frac{v_0^2}{2g} - \frac{g}{2v_0^2}\rho^2.$$

Describe this surface and comment on its dimensions.

$^{11}$ For more details on Taylor’s series see, for example, Mary Boas, Mathematical Methods in the Physical Sciences
(Wiley, 1983), p. 22 or Donald McQuarrie, Mathematical Methods for Scientists and Engineers (University
Science Books, 2003), p. 94.


2.22 *** [Computer] The equation (2.39) for the range of a projectile in a linear medium cannot be
solved analytically in terms of elementary functions. If you put in numbers for the several parameters,
then it can be solved numerically using any of several software packages such as Mathematica, Maple,
and MatLab. To practice this, do the following: Consider a projectile launched at angle $\theta$ above the
horizontal ground with initial speed $v_0$ in a linear medium. Choose units such that $v_0 = 1$ and $g = 1$.
Suppose also that the terminal speed $v_{\text{ter}} = 1$. (With $v_0 = v_{\text{ter}}$, air resistance should be fairly important.)
We know that in a vacuum, the maximum range occurs at $\theta = \pi/4 \approx 0.75$. (a) What is the maximum
range in a vacuum? (b) Now solve (2.39) for the range in the given medium at the same angle $\theta = 0.75$.
(c) Once you have your calculation working, repeat it for some selection of values of $\theta$ within which
the maximum range probably lies. (You could try $\theta = 0.4, 0.5, \ldots, 0.8$.) (d) Based on these results,
choose a smaller interval for $\theta$ where you’re sure the maximum lies and repeat the process. Repeat it
again if necessary until you know the maximum range and the corresponding angle to two significant
figures. Compare with the vacuum values.


section 2.4 Quadratic Air Resistance 

2.23 * Find the terminal speeds in air of (a) a steel ball bearing of diameter 3 mm, (b) a 16-pound steel
shot, and (c) a 200-pound parachutist in free fall in the fetal position. In all three cases, you can safely
assume the drag force is purely quadratic. The density of steel is about 8 g/cm$^3$ and you can treat the
parachutist as a sphere of density 1 g/cm$^3$.

2.24 * Consider a sphere (diameter $D$, density $\varrho_{\text{sph}}$) falling through air (density $\varrho_{\text{air}}$) and assume that
the drag force is purely quadratic. (a) Use Equation (2.84) from Problem 2.4 (with $\kappa = 1/4$ for a sphere)
to show that the terminal speed is

$$v_{\text{ter}} = \sqrt{\frac{8}{3}Dg\frac{\varrho_{\text{sph}}}{\varrho_{\text{air}}}}. \tag{2.88}$$

(b) Use this result to show that of two spheres of the same size, the denser one will eventually fall
faster. (c) For two spheres of the same material, show that the larger will eventually fall faster.

2.25 * Consider the cyclist of Section 2.4, coasting to a halt under the influence of a quadratic drag
force. Derive in detail the results (2.49) and (2.51) for her velocity and position, and verify that the
constant $\tau = m/cv_0$ is indeed a time.

2.26 * A typical value for the coefficient of quadratic air resistance on a cyclist is around $c = 0.20$
N/(m/s)$^2$. Assuming that the total mass (cyclist plus cycle) is $m = 80$ kg and that at $t = 0$ the cyclist
has an initial speed $v_0 = 20$ m/s (about 45 mi/h) and starts to coast to a stop under the influence of air
resistance, find the characteristic time $\tau = m/cv_0$. How long will it take him to slow to 15 m/s? What
about 10 m/s? And 5 m/s? (Below about 5 m/s, it is certainly not reasonable to ignore friction, so there
is no point pursuing this calculation to lower speeds.)

2.27 * I kick a puck of mass $m$ up an incline (angle of slope $= \theta$) with initial speed $v_0$. There is no
friction between the puck and the incline, but there is air resistance with magnitude $f(v) = cv^2$. Write
down and solve Newton’s second law for the puck’s velocity as a function of $t$ on the upward journey.
How long does the upward journey last?

2.28 * A mass $m$ has speed $v_0$ at the origin and coasts along the $x$ axis in a medium where the drag
force is $F(v) = -cv^{3/2}$. Use the “$v\,dv/dx$ rule” (2.86) in Problem 2.12 to write the equation of motion
in the separated form $m\,v\,dv/F(v) = dx$, and then integrate both sides to give $x$ in terms of $v$ (or vice
versa). Show that it will eventually travel a distance $2m\sqrt{v_0}/c$.

2.29 * The terminal speed of a 70-kg skydiver in spread-eagle position is around 50 m/s (about 115
mi/h). Find his speed at times $t = 1, 5, 10, 20, 30$ seconds after he jumps from a stationary balloon.
Compare with the corresponding speeds if there were no air resistance.

2.30 * Suppose we wish to approximate the skydiver of Problem 2.29 as a sphere (not a very promising 
approximation, but nevertheless the kind of approximation physicists sometimes like to make). Given 
the mass and terminal speed, what should we use for the diameter of the sphere? Does your answer 
seem reasonable? 

2.31 ** A basketball has mass m = 600 g and diameter D = 24 cm. (a) What is its terminal speed? 

(b) If it is dropped from a 30-m tower, how long does it take to hit the ground and how fast is it going 
when it does so? Compare with the corresponding numbers in a vacuum. 

2.32 ** Consider the following statement: If at all times during a projectile’s flight its speed is much
less than the terminal speed, the effects of air resistance are usually very small. (a) Without reference
to the explicit equations for the magnitude of $v_{\text{ter}}$, explain clearly why this is so. (b) By examining the
explicit formulas (2.26) and (2.53) explain why the statement above is even more useful for the case
of quadratic drag than for the linear case. [Hint: Express the ratio $f/mg$ of the drag to the weight in
terms of the ratio $v/v_{\text{ter}}$.]

2.33 ** The hyperbolic functions $\cosh z$ and $\sinh z$ are defined as follows:

$$\cosh z = \frac{e^z + e^{-z}}{2} \quad\text{and}\quad \sinh z = \frac{e^z - e^{-z}}{2}$$

for any $z$, real or complex. (a) Sketch the behavior of both functions over a suitable range of real
values of $z$. (b) Show that $\cosh z = \cos(iz)$. What is the corresponding relation for $\sinh z$? (c) What are
the derivatives of $\cosh z$ and $\sinh z$? What about their integrals? (d) Show that $\cosh^2 z - \sinh^2 z = 1$.
(e) Show that $\int dx/\sqrt{1 + x^2} = \operatorname{arcsinh} x$. [Hint: One way to do this is to make the substitution
$x = \sinh z$.]

2.34 ** The hyperbolic function $\tanh z$ is defined as $\tanh z = \sinh z/\cosh z$, with $\cosh z$ and $\sinh z$
defined as in Problem 2.33. (a) Prove that $\tanh z = -i\tan(iz)$. (b) What is the derivative of $\tanh z$?
(c) Show that $\int dz\,\tanh z = \ln\cosh z$. (d) Prove that $1 - \tanh^2 z = \operatorname{sech}^2 z$, where $\operatorname{sech} z = 1/\cosh z$.
(e) Show that $\int dx/(1 - x^2) = \operatorname{arctanh} x$.

2.35 ** (a) Fill in the details of the arguments leading from the equation of motion (2.52) to Equations
(2.57) and (2.58) for the velocity and position of a dropped object subject to quadratic air resistance.
Be sure to do the two integrals involved. (The results of Problem 2.34 will help.) (b) Tidy the two
equations by introducing the parameter $\tau = v_{\text{ter}}/g$. Show that when $t = \tau$, $v$ has reached 76% of its
terminal value. What are the corresponding percentages when $t = 2\tau$ and $3\tau$? (c) Show that when
$t \gg \tau$, the position is approximately $y \approx v_{\text{ter}}t + \text{const}$. [Hint: The definition of $\cosh x$ (Problem 2.33)
gives you a simple approximation when $x$ is large.] (d) Show that for $t$ small, Equation (2.58) for the
position gives $y \approx \tfrac{1}{2}gt^2$. [Use the Taylor series for $\cosh x$ and for $\ln(1 + \delta)$.]

2.36 ** Consider the following quote from Galileo’s Dialogues Concerning Two New Sciences: 

Aristotle says that “an iron ball of 100 pounds falling from a height of one hundred cubits reaches the 
ground before a one-pound ball has fallen a single cubit.” I say that they arrive at the same time. You 
find, on making the experiment, that the larger outstrips the smaller by two finger-breadths, that is, when 
the larger has reached the ground, the other is short of it by two finger-breadths. 

We know that the statement attributed to Aristotle is totally wrong, but just how close is Galileo’s claim 
that the difference is just “two finger breadths”? (a) Given that the density of iron is about 8 g/cm$^3$, find
the terminal speeds of the two iron balls. (b) Given that a cubit is about 2 feet, use Equation (2.58) to
find the time for the heavier ball to land and then the position of the lighter ball at that time. How far 
apart are they? 

2.37 ** The result (2.57) for the velocity of a falling object was found by integrating Equation (2.55) 
and the quickest way to do this is to use the integral $\int du/(1 - u^2) = \operatorname{arctanh} u$. Here is another way
to do it: Integrate (2.55) using the method of “partial fractions,” writing

$$\frac{1}{1 - u^2} = \frac{1}{2}\left(\frac{1}{1 + u} + \frac{1}{1 - u}\right),$$

which lets you do the integral in terms of natural logs. Solve the resulting equation to give v as a 
function of t and show that your answer agrees with (2.57). 

2.38 ** A projectile that is subject to quadratic air resistance is thrown vertically up with initial 
speed $v_0$. (a) Write down the equation of motion for the upward motion and solve it to give $v$ as a
function of $t$. (b) Show that the time to reach the top of the trajectory is

$$t_{\rm top} = (v_{\rm ter}/g) \arctan(v_0/v_{\rm ter}).$$

(c) For the baseball of Example 2.5 (with $v_{\rm ter} = 35$ m/s), find $t_{\rm top}$ for the cases that $v_0 = 1$, 10, 20, 30,
and 40 m/s, and compare with the corresponding values in a vacuum. 

2.39 ** When a cyclist coasts to a stop, he is actually subject to two forces, the quadratic force of air 
resistance, $f = -cv^2$ (with $c$ as given in Problem 2.26), and a constant frictional force $f_{\rm fr}$ of about 3
N. The former is dominant at high and medium speeds, the latter at low speed. (The frictional force is a
combination of ordinary friction in the bearings and rolling friction of the tires on the road.) (a) Write
down the equation of motion while the cyclist is coasting to a stop. Solve it by separating variables to
give $t$ as a function of $v$. (b) Using the numbers of Problem 2.26 (and the value $f_{\rm fr} = 3$ N given above)
find how long it takes the cyclist to slow from his initial 20 m/s to 15 m/s. How long to slow to 10 and 
5 m/s? How long to come to a full stop? If you did Problem 2.26, compare with the answers you got 
there ignoring friction entirely. 

2.40 ** Consider an object that is coasting horizontally (positive x direction) subject to a drag force 
$f = -bv - cv^2$. Write down Newton’s second law for this object and solve for $v$ by separating
variables. Sketch the behavior of v as a function of t. Explain the time dependence for t large. (Which 
force term is dominant when t is large?) 

2.41 ** A baseball is thrown vertically up with speed $v_0$ and is subject to a quadratic drag with
magnitude $f(v) = cv^2$. Write down the equation of motion for the upward journey (measuring $y$
vertically up) and show that it can be rewritten as $\dot{v} = -g[1 + (v/v_{\rm ter})^2]$. Use the “$v\,dv/dx$ rule”
(2.86) to write $\dot{v}$ as $v\,dv/dy$, and then solve the equation of motion by separating variables (put all
terms involving $v$ on one side and all terms involving $y$ on the other). Integrate both sides to give $y$ in
terms of $v$, and hence $v$ as a function of $y$. Show that the baseball’s maximum height is

$$y_{\max} = \frac{v_{\rm ter}^2}{2g} \ln\left(\frac{v_{\rm ter}^2 + v_0^2}{v_{\rm ter}^2}\right). \qquad (2.89)$$

If $v_0 = 20$ m/s (about 45 mph) and the baseball has the parameters given in Example 2.5 (page 61),
what is $y_{\max}$? Compare with the value in a vacuum.

2.42 ** Consider again the baseball of Problem 2.41 and write down the equation of motion for the
downward journey. (Notice that with a quadratic drag the downward equation is different from the
upward one, and has to be treated separately.) Find $v$ as a function of $y$ and, given that the downward
journey starts at $y_{\max}$ as given in (2.89), show that the speed when the ball returns to the ground is
$v_{\rm ter} v_0/\sqrt{v_{\rm ter}^2 + v_0^2}$. Discuss this result for the cases of very much and very little air resistance. What
is the numerical value of this speed for the baseball of Problem 2.41? Compare with the value in a 
vacuum. 


2.43 [Computer] The basketball of Problem 2.31 is thrown from a height of 2 m with initial velocity 
$v_0 = 15$ m/s at 45° above the horizontal. (a) Use appropriate software to solve the equations of motion
(2.61) for the ball’s position $(x, y)$ and plot the trajectory. Show the corresponding trajectory in the
absence of air resistance. (b) Use your plot to find how far the ball travels in the horizontal direction
before it hits the floor. Compare with the corresponding range in a vacuum. 

2.44 [Computer] To get an accurate trajectory for a projectile one must often take account of several 
complications. For example, if a projectile goes very high then we have to allow for the reduction 
in air resistance as atmospheric density decreases. To illustrate this, consider an iron cannonball 
(diameter 15 cm, density 7.8 g/cm$^3$) that is fired with initial velocity 300 m/s at 50 degrees above
the horizontal. The drag force is approximately quadratic, but since the drag is proportional to the
atmospheric density and the density falls off exponentially with height, the drag force is $f = c(y)v^2$
where $c(y) = \gamma D^2 \exp(-y/\lambda)$ with $\gamma$ given by (2.6) and $\lambda \approx 10{,}000$ m. (a) Write down the equations
of motion for the cannonball and use appropriate software to solve numerically for $x(t)$ and $y(t)$ for
0 < t < 3.5 s. Plot the ball’s trajectory and find its horizontal range. (b) Do the same calculation ignoring
the variation of atmospheric density [that is, setting c(y) = c(0)], and yet again ignoring air resistance 
entirely. Plot all three trajectories for 0 < t < 3.5 s on the same graph. You will find that in this case 
air resistance makes a huge difference and that the variation of air resistance makes a small, but not 
negligible, difference. 


section 2.6 Complex Exponentials 

2.45 * (a) Using Euler’s relation (2.76), prove that any complex number $z = x + iy$ can be written in
the form $z = re^{i\theta}$, where $r$ and $\theta$ are real. Describe the significance of $r$ and $\theta$ with reference to the
complex plane. (b) Write $z = 3 + 4i$ in the form $z = re^{i\theta}$. (c) Write $z = 2e^{-i\pi/3}$ in the form $x + iy$.

2.46 * For any complex number z = x + iy, the real and imaginary parts are defined as the real 
numbers ${\rm Re}(z) = x$ and ${\rm Im}(z) = y$. The modulus or absolute value is $|z| = \sqrt{x^2 + y^2}$ and the phase
or angle is the value of $\theta$ when $z$ is expressed as $z = re^{i\theta}$. The complex conjugate is $z^* = x - iy$.
(This last is the notation used by most physicists; most mathematicians use $\bar{z}$.) For each of the following
complex numbers, find the real and imaginary parts, the modulus and phase, and the complex conjugate,
and sketch $z$ and $z^*$ in the complex plane:

(a) $z = 1 + i$    (b) $z = 1 - i\sqrt{3}$
(c) $z = \sqrt{2}\,e^{-i\pi/4}$    (d) $z = 5e^{i\omega t}$.

In part (d), $\omega$ is a constant and $t$ is the time.

2.47 * For each of the following two pairs of numbers compute z + w, z — w, zw, and z/w. 

(a) $z = 6 + 8i$ and $w = 3 - 4i$    (b) $z = 8e^{i\pi/3}$ and $w = 4e^{i\pi/6}$.

Notice that for adding and subtracting complex numbers, the form $x + iy$ is more convenient, but for
multiplying and especially dividing, the form $re^{i\theta}$ is more convenient. In part (a), a clever trick for
finding $z/w$ without converting to the form $re^{i\theta}$ is to multiply top and bottom by $w^*$; try this one both
ways. 

2.48 * Prove that $|z| = \sqrt{z^* z}$ for any complex number $z$.

2.49 * Consider the complex number $z = e^{i\theta} = \cos\theta + i\sin\theta$. (a) By evaluating $z^2$ two different ways,
prove the trig identities $\cos 2\theta = \cos^2\theta - \sin^2\theta$ and $\sin 2\theta = 2\sin\theta\cos\theta$. (b) Use the same technique
to find corresponding identities for $\cos 3\theta$ and $\sin 3\theta$.

2.50 * Use the series definition (2.72) of $e^z$ to prove that $de^z/dz = e^z$.$^{12}$

2.51 ** Use the series definition (2.72) of $e^z$ to prove that $e^z e^w = e^{z+w}$. [Hint: If you write down the
left side as a product of two series, you will have a huge sum of terms like $z^n w^m$. If you group together
all the terms for which n + m is the same (call it p) and use the binomial theorem, you will find you 
have the series for the right side.] 

section 2.7 Solution for the Charge in a B Field 

2.52 * The transverse velocity of the particle in Sections 2.5 and 2.7 is contained in (2.77), since 
$\eta = v_x + iv_y$. By taking the real and imaginary parts, find expressions for $v_x$ and $v_y$ separately. Based
on these expressions describe the time dependence of the transverse velocity. 

2.53 * A charged particle of mass m and positive charge q moves in uniform electric and magnetic 
fields, $\mathbf{E}$ and $\mathbf{B}$, both pointing in the $z$ direction. The net force on the particle is $\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$.
Write down the equation of motion for the particle and resolve it into its three components. Solve the 
equations and describe the particle’s motion. 

2.54 ** In Section 2.5 we solved the equations of motion (2.68) for the transverse velocity of a charge 
in a magnetic field by the trick of using the complex number $\eta = v_x + iv_y$. As you might imagine,
the equations can certainly be solved without this trick. Here is one way: (a) Differentiate the first of
equations (2.68) with respect to $t$ and use the second to give you a second-order differential equation
for $v_x$. This is an equation you should recognize [if not, look at Equation (1.55)] and you can write
down its general solution. Once you know $v_x$, (2.68) tells you $v_y$. (b) Show that the general solution
you get here is the same as the general solution contained in (2.77), as disentangled in Problem 2.52. 


12 If you are the type who worries about mathematical niceties, you may be wondering if it is permissible 
to differentiate an infinite series. Fortunately, in the case of a power series (such as this), there is a theorem 
that guarantees the series can be differentiated for any z inside the “radius of convergence.” Since the radius of 
convergence of the series for $e^z$ is infinite, we can differentiate it for any value of $z$.

2.55 *** A charged particle of mass $m$ and positive charge $q$ moves in uniform electric and magnetic
fields, $\mathbf{E}$ pointing in the $y$ direction and $\mathbf{B}$ in the $z$ direction (an arrangement called “crossed E and B
fields”). Suppose the particle is initially at the origin and is given a kick at time $t = 0$ along the $x$ axis
with $v_x = v_{x0}$ (positive or negative). (a) Write down the equation of motion for the particle and resolve
it into its three components. Show that the motion remains in the plane $z = 0$. (b) Prove that there is
a unique value of $v_{x0}$, called the drift speed $v_{\rm dr}$, for which the particle moves undeflected through the
fields. (This is the basis of velocity selectors, which select particles traveling at one chosen speed from
a beam with many different speeds.) (c) Solve the equations of motion to give the particle’s velocity
as a function of $t$, for arbitrary values of $v_{x0}$. [Hint: The equations for $(v_x, v_y)$ should look very like
Equations (2.68) except for an offset of $v_x$ by a constant. If you make a change of variables of the
form $u_x = v_x - v_{\rm dr}$ and $u_y = v_y$, the equations for $(u_x, u_y)$ will have exactly the form (2.68), whose
general solution you know.] (d) Integrate the velocity to find the position as a function of $t$ and sketch
the trajectory for various values of $v_{x0}$.




CHAPTER 3

Momentum and Angular Momentum


In this and the next chapter I shall describe the great conservation laws of momentum, 
angular momentum, and energy. These three laws are closely related to one another 
and are perhaps the most important of the small number of conservation laws that are 
regarded as cornerstones of all modern physics. Curiously, in classical mechanics the 
first two laws (momentum and angular momentum) are very different from the last 
(energy). It is a relatively easy matter to prove the first two from Newton’s laws (indeed 
we already have proved conservation of momentum), whereas the proof of energy 
conservation is surprisingly subtle. I discuss momentum and angular momentum in 
this rather short chapter and energy in Chapter 4, which is appreciably longer. 


3.1 Conservation of Momentum 


In Chapter 1 we examined a system of $N$ particles labeled $\alpha = 1, \cdots, N$. We found
that as long as all the internal forces obey Newton’s third law, the rate of change of the
system’s total linear momentum $\mathbf{P} = \mathbf{p}_1 + \cdots + \mathbf{p}_N$ is determined entirely
by the external forces on the system:

$$\dot{\mathbf{P}} = \mathbf{F}^{\rm ext} \qquad (3.1)$$

where $\mathbf{F}^{\rm ext}$ denotes the total external force on the system. Because of the third law, the
internal forces all cancel out of the rate of change of the total momentum. In particular, 
if the system is isolated, so that the total external force is zero, we have the 


Principle of Conservation of Momentum 
If the net external force $\mathbf{F}^{\rm ext}$ on an $N$-particle system is zero, the system’s total
mechanical momentum $\mathbf{P} = \sum m_\alpha \mathbf{v}_\alpha$ is constant.

If our system contains just one particle ($N = 1$), then all forces on the particle are
external, and the conservation of momentum is reduced to the not very interesting
statement that, in the absence of any forces, the momentum of a single particle
is constant, which is just Newton’s first law. However, if our system has two or
more particles ($N \ge 2$), then momentum conservation is a nontrivial and often useful
property, as the following simple and well-known example will remind you. 


example 3.1 An Inelastic Collision of Two Bodies 

Two bodies (two lumps of putty, for example, or two cars at an intersection) 
have masses $m_1$ and $m_2$ and velocities $\mathbf{v}_1$ and $\mathbf{v}_2$. The two bodies collide and lock
together, so they move off as a single unit, as shown in Figure 3.1. (A collision
in which the bodies lock together like this is said to be perfectly inelastic.)
Assuming that any external forces are negligible during the brief moment of
collision, find the velocity $\mathbf{v}$ just after the collision.

The initial total momentum, just before the collision, is

$$\mathbf{P}_{\rm in} = m_1 \mathbf{v}_1 + m_2 \mathbf{v}_2,$$

and the final momentum, just after the collision, is

$$\mathbf{P}_{\rm fin} = m_1 \mathbf{v} + m_2 \mathbf{v} = (m_1 + m_2)\mathbf{v}.$$

(Notice that this last equation illustrates the useful result that, once two bodies
have locked together, we can find their momentum by considering them as
a single body of mass $m_1 + m_2$.) By conservation of momentum these two
momenta must be equal, $\mathbf{P}_{\rm fin} = \mathbf{P}_{\rm in}$, and we can easily solve to give the final
velocity,

$$\mathbf{v} = \frac{m_1 \mathbf{v}_1 + m_2 \mathbf{v}_2}{m_1 + m_2}. \qquad (3.2)$$

We see that the final velocity is just the weighted average of the original velocities
$\mathbf{v}_1$ and $\mathbf{v}_2$, weighted by the corresponding masses $m_1$ and $m_2$.




Figure 3.1 A perfectly inelastic collision between 
two lumps of putty.

An important special case is when one of the bodies is initially at rest, as
when a speeding car rams a stationary car at a stop light. With $\mathbf{v}_2 = 0$, Equation
(3.2) reduces to

$$\mathbf{v} = \frac{m_1}{m_1 + m_2}\,\mathbf{v}_1. \qquad (3.3)$$

In this case the final velocity is always in the same direction as $\mathbf{v}_1$ but is reduced
by the factor $m_1/(m_1 + m_2)$. The result (3.3) is used by police investigating car
crashes, since it lets them find the unknown velocity $\mathbf{v}_1$ of a speeder who has
rear-ended a stationary car, in terms of quantities that can be measured after the 
event. (The final velocity v can be found from the skid marks of the combined 
wreck.) 

This sort of analysis of collisions, using conservation of momentum, is 
an important tool in solving many problems ranging from nuclear reactions, 
through car crashes, to collisions of galaxies. 
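
As a quick numerical check of Equations (3.2) and (3.3), here is a minimal Python sketch. The masses and the initial speed are made-up illustrative values (they are not data from the text), and the function name final_velocity is simply a label chosen for this sketch.

```python
def final_velocity(m1, v1, m2, v2):
    """Velocity just after a perfectly inelastic collision, Equation (3.2).

    v1 and v2 are velocity vectors given as tuples of components.
    """
    return tuple((m1 * a + m2 * b) / (m1 + m2) for a, b in zip(v1, v2))


# Illustrative (assumed) numbers: a 1500 kg car moving at 20 m/s along x
# rear-ends a stationary 1000 kg car.
m1, m2 = 1500.0, 1000.0
v1 = (20.0, 0.0)
v2 = (0.0, 0.0)

v = final_velocity(m1, v1, m2, v2)

# Total momentum before and after should agree component by component.
p_in = tuple(m1 * a + m2 * b for a, b in zip(v1, v2))
p_fin = tuple((m1 + m2) * c for c in v)

# Inverting (3.3): recover the initial speed from the measured final speed.
v1_recovered = (m1 + m2) / m1 * v[0]

print("final velocity:", v)
print("momentum before:", p_in, "after:", p_fin)
print("recovered initial speed:", v1_recovered)
```

With these numbers the combined wreck moves at 12 m/s, which is the initial speed reduced by the factor 1500/2500, and inverting (3.3) returns the original 20 m/s.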


3.2 Rockets 


A beautiful example of the use of momentum conservation is the analysis of rocket 
propulsion. The basic problem that is solved by the rocket is this: With no external 
agent to push on or be pushed by, how does an object get itself moving? You can 
put yourself in the same difficulty by imagining yourself stranded on a perfectly 
frictionless frozen lake. The simplest way to get yourself to shore is to take off anything 
that is dispensable, such as a boot, and throw it as hard as possible away from the shore.
By Newton’s third law, when you push one way on the boot, the boot pushes in the 
opposite direction on you. Thus as you throw the boot, the reaction force of the boot 
on you will cause you to recoil in the opposite direction and then glide across the ice 
to shore. A rocket does essentially the same thing. Its motor is designed to hurl the 
spent fuel out of the back of the rocket, and by the third law, the fuel pushes the rocket 
forward. 

To analyse a rocket’s motion quantitatively we must examine the total momentum. 
Consider the rocket shown in Figure 3.2 with mass m, traveling in the positive x 
direction (so I can abbreviate $v_x$ as just $v$) and ejecting spent fuel at the exhaust speed
$v_{\rm ex}$ relative to the rocket. Since the rocket is ejecting mass, the rocket’s mass $m$ is
steadily decreasing. At time $t$, the momentum is $P(t) = mv$. A short time later at
$t + dt$,$^1$ the rocket’s mass is $(m + dm)$, where $dm$ is negative, and its momentum is
$(m + dm)(v + dv)$. The fuel ejected in the time $dt$ has mass $(-dm)$ and velocity
$v - v_{\rm ex}$ relative to the ground. Thus the total momentum (rocket plus the fuel just
ejected) at $t + dt$ is

$$P(t + dt) = (m + dm)(v + dv) - dm(v - v_{\rm ex}) = mv + m\,dv + dm\,v_{\rm ex}$$

where I have neglected the doubly small product $dm\,dv$. Therefore, the change in total
momentum is

$$dP = P(t + dt) - P(t) = m\,dv + dm\,v_{\rm ex}. \qquad (3.4)$$

1 Concerning the use of the small quantities like $dt$ and $dm$, I recommend again the view that
they are small but nonzero increments, with $dt$ chosen sufficiently small that $dm$ divided by $dt$ is
(within whatever we have chosen as our desired accuracy) equal to the derivative $dm/dt$. For more
details, see the footnote immediately before Equation (2.47).

Figure 3.2 A rocket of mass $m$ travels to the right with
speed $v$ and ejects spent fuel with exhaust speed $v_{\rm ex}$ relative
to the rocket.

If there is a net external force $F^{\rm ext}$ (gravity, for instance), this change of momentum
is $F^{\rm ext}\,dt$. (See Problem 3.11.) Here I shall assume that there are no external forces,
so that P is constant and dP = 0. Therefore 


$$m\,dv = -dm\,v_{\rm ex}. \qquad (3.5)$$


Dividing both sides by dt, we can rewrite this as 


$$m\dot{v} = -\dot{m}\,v_{\rm ex} \qquad (3.6)$$

where $-\dot{m}$ is the rate at which the rocket’s engine is ejecting mass. This equation
looks just like Newton’s second law ($m\dot{v} = F$) for an ordinary particle, except that
the product $-\dot{m}\,v_{\rm ex}$ on the right plays the role of the force. For this reason this product
is often called the thrust:

$$\text{thrust} = -\dot{m}\,v_{\rm ex}. \qquad (3.7)$$

(Since $\dot{m}$ is negative, this defines the thrust to be positive.)

Equation (3.5) can be solved by separation of variables. Dividing both sides by $m$
gives

$$dv = -v_{\rm ex}\,\frac{dm}{m}.$$

If the exhaust speed $v_{\rm ex}$ is constant, this equation can be integrated to give

$$v - v_0 = v_{\rm ex} \ln(m_0/m) \qquad (3.8)$$

where $v_0$ is the initial velocity and $m_0$ is the initial mass of the rocket (including fuel
and payload). This result puts a significant restriction on the maximum speed of the
rocket. The ratio $m_0/m$ is largest when all the fuel is burned and $m$ is just the mass
of rocket plus payload. Even if, for example, the original mass is 90% fuel, this ratio
is only 10, and, since $\ln 10 \approx 2.3$, this says that the speed gained, $v - v_0$, cannot be
more than 2.3 times $v_{\rm ex}$. This means that rocket engineers try to make $v_{\rm ex}$ as big as
possible and also design multistage rockets, which can jettison the heavy fuel tanks
of the early stages to reduce the total mass for later stages.$^2$
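
The result (3.8) can be checked by stepping Equation (3.5) forward numerically. The following Python sketch is only an illustration: the exhaust speed of 3000 m/s and the masses are assumed values, and the midpoint stepping rule is my choice of method, not something prescribed by the text.

```python
import math


def delta_v_exact(v_ex, m0, m):
    """Speed gained according to Equation (3.8), with constant exhaust speed."""
    return v_ex * math.log(m0 / m)


def delta_v_numeric(v_ex, m0, m, steps=100_000):
    """Speed gained by summing the increments dv = -v_ex dm / m of Equation (3.5)."""
    dm = (m - m0) / steps          # each mass increment is negative
    v = 0.0
    mass = m0
    for _ in range(steps):
        # Evaluate 1/m at the middle of the step (midpoint rule).
        v += -v_ex * dm / (mass + 0.5 * dm)
        mass += dm
    return v


# Assumed values: exhaust speed 3000 m/s, rocket that is 90% fuel by mass.
v_ex = 3000.0
m0, m_final = 10_000.0, 1_000.0

exact = delta_v_exact(v_ex, m0, m_final)
numeric = delta_v_numeric(v_ex, m0, m_final)

print("ratio of speed gained to v_ex:", exact / v_ex)
print("exact:", exact, "numeric:", numeric)
```

The printed ratio is ln 10, about 2.3, in line with the estimate above, and the step-by-step sum agrees with the closed form to well within a millimeter per second for these inputs.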


3.3 The Center of Mass 


Several of the ideas of Section 3.1 can be rephrased in terms of the important notion of 
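a system’s center of mass. Let us consider a group of $N$ particles, $\alpha = 1, \cdots, N$, with
masses $m_\alpha$ and positions $\mathbf{r}_\alpha$ measured relative to an origin $O$. The center of mass (or
CM) of this system is defined to be the position (relative to the same origin $O$)

$$\mathbf{R} = \frac{1}{M}\sum_{\alpha=1}^{N} m_\alpha \mathbf{r}_\alpha = \frac{m_1 \mathbf{r}_1 + \cdots + m_N \mathbf{r}_N}{M} \qquad (3.9)$$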


where $M$ denotes the total mass of all of the particles, $M = \sum m_\alpha$. The first thing to
note about this definition is that it is a vector equation. The CM position is a vector $\mathbf{R}$
with three components $(X, Y, Z)$, and Equation (3.9) is equivalent to three equations
giving these three components,

$$X = \frac{1}{M}\sum_{\alpha=1}^{N} m_\alpha x_\alpha, \qquad Y = \frac{1}{M}\sum_{\alpha=1}^{N} m_\alpha y_\alpha, \qquad Z = \frac{1}{M}\sum_{\alpha=1}^{N} m_\alpha z_\alpha.$$

Either way, the CM position $\mathbf{R}$ is a weighted average of the positions $\mathbf{r}_1, \cdots, \mathbf{r}_N$, in
which each position $\mathbf{r}_\alpha$ is weighted by the corresponding mass $m_\alpha$. (Equivalently, it
is the sum of the $\mathbf{r}_\alpha$, each multiplied by the fraction of the total mass at $\mathbf{r}_\alpha$.)

To get a feeling for the CM, it may help to consider the case of just two particles 
(N = 2). In this case, the definition (3.9) reads 


$$\mathbf{R} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{m_1 + m_2}. \qquad (3.10)$$

It is easy to verify that the CM position has several familiar properties. For example,
you can show (Problem 3.18) that the CM defined by (3.10) lies on the line joining
the two particles, as shown in Figure 3.3. It is also easy to show that the distances of
the CM from $m_1$ and $m_2$ are in the ratio $m_2/m_1$, so that the CM lies closer to the more
massive particle. (In Figure 3.3 this ratio is 1/3.) In particular, if $m_1$ is much greater
than $m_2$, the CM will be very close to $\mathbf{r}_1$. More generally, going back to Equation
(3.9) for the CM of $N$ particles, we see that if $m_1$ is much greater than any of the
other masses (as is the case for the sun as compared to all the planets), then $M \approx m_1$
while $m_\alpha \ll M$ for all other particles; this means that $\mathbf{R}$ is very close to $\mathbf{r}_1$. Thus, for
example, the CM of the solar system is very close to the sun.
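
The definition (3.9) translates directly into a few lines of code. In this Python sketch the two masses and positions are assumed values, chosen so that the mass ratio is 3 to 1 as in Figure 3.3; the helper name center_of_mass is hypothetical.

```python
import math


def center_of_mass(masses, positions):
    """CM position from Equation (3.9); positions are (x, y, z) tuples."""
    total_mass = sum(masses)
    return tuple(
        sum(m * r[i] for m, r in zip(masses, positions)) / total_mass
        for i in range(3)
    )


# Assumed example: m1 = 3 at the origin, m2 = 1 a distance 4 away along x.
masses = [3.0, 1.0]
positions = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

R = center_of_mass(masses, positions)

# Distances of the CM from each particle; their ratio should equal m2/m1.
d1 = math.dist(R, positions[0])
d2 = math.dist(R, positions[1])

print("CM position:", R)
print("distance ratio d1/d2:", d1 / d2, " mass ratio m2/m1:", masses[1] / masses[0])
```

Here the CM sits on the line joining the particles, one unit from the heavier mass and three units from the lighter one, so the distance ratio is 1/3.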


2 Jettisoning the fuel tanks of stage 1 reduces the initial and final masses of stage 2 by the same
amount. This increases the ratio $m_0/m$ when we apply (3.8) to stage 2. See Problem 3.12.

Figure 3.3 The CM of two particles lies at the position
$\mathbf{R} = (m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2)/M$. You can prove that this lies on
the line joining $m_1$ to $m_2$, as shown, and that the distances
of the CM from $m_1$ and $m_2$ are in the ratio $m_2/m_1$.


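We can now write the total momentum $\mathbf{P}$ of any $N$-particle system in terms of the
system’s CM as follows:

$$\mathbf{P} = \sum_\alpha \mathbf{p}_\alpha = \sum_\alpha m_\alpha \dot{\mathbf{r}}_\alpha = M\dot{\mathbf{R}} \qquad (3.11)$$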

where the last equality is just the derivative of the definition (3.9) of R (multiplied by 
M). This remarkable result says that the total momentum of the N particles is exactly 
the same as that of a single particle of mass M and velocity equal to that of the CM. 

We get an even more striking result when we differentiate (3.11). According to 
(3.1), the derivative of $\mathbf{P}$ is just $\mathbf{F}^{\rm ext}$. Therefore, (3.11) implies that

$$\mathbf{F}^{\rm ext} = M\ddot{\mathbf{R}}. \qquad (3.12)$$

That is, the center of mass R moves exactly as if it were a single particle of mass M, 
subject to the net external force on the system. This result is the main reason why we 
can often treat extended bodies, such as baseballs and planets, as if they were point 
particles. Provided a body is small compared to the scale of its trajectory, its CM 
position R is a good representative of its overall location, and (3.12) implies that R 
moves just like a point particle. 
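
To see (3.12) at work, consider two masses joined by a spring and thrown vertically under uniform gravity. The individual motions are complicated, but the CM should follow the simple free-fall trajectory of a single particle. The Python sketch below is a hedged illustration: all parameter values (masses, spring constant, rest length, initial conditions) are assumptions, and the velocity-Verlet stepping scheme is my choice of integrator.

```python
def simulate_cm(m1, m2, y1, y2, v1, v2, k=50.0, rest=1.0, g=9.8,
                dt=1e-3, steps=1000):
    """Integrate two spring-coupled masses in gravity; return CM height and velocity."""

    def forces(y1, y2):
        stretch = (y2 - y1) - rest
        f_spring = k * stretch            # internal force on particle 1
        # Internal forces are equal and opposite; gravity is the external force.
        return f_spring - m1 * g, -f_spring - m2 * g

    f1, f2 = forces(y1, y2)
    for _ in range(steps):
        # Velocity Verlet: half kick, drift, recompute forces, half kick.
        v1 += 0.5 * dt * f1 / m1
        v2 += 0.5 * dt * f2 / m2
        y1 += dt * v1
        y2 += dt * v2
        f1, f2 = forces(y1, y2)
        v1 += 0.5 * dt * f1 / m1
        v2 += 0.5 * dt * f2 / m2

    M = m1 + m2
    return (m1 * y1 + m2 * y2) / M, (m1 * v1 + m2 * v2) / M


# Assumed initial conditions: the spring starts stretched, only mass 1 is moving.
m1, m2 = 1.0, 3.0
y1, y2, v1, v2 = 0.0, 1.5, 2.0, 0.0
g, t = 9.8, 1.0

cm_y, cm_v = simulate_cm(m1, m2, y1, y2, v1, v2, g=g, dt=1e-3, steps=1000)

# Prediction of (3.12): the CM moves like one particle of mass M in gravity.
Y0 = (m1 * y1 + m2 * y2) / (m1 + m2)
V0 = (m1 * v1 + m2 * v2) / (m1 + m2)
predicted_y = Y0 + V0 * t - 0.5 * g * t * t
predicted_v = V0 - g * t

print("simulated CM:", cm_y, cm_v)
print("predicted CM:", predicted_y, predicted_v)
```

The spring force never appears in the prediction; it cancels out of the CM motion exactly as the internal forces cancel in (3.1).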

Given the importance of the CM, you need to feel comfortable calculating the CM 
position for various systems. You may have had plenty of practice in introductory 
physics or in a calculus course, but, in case you didn’t, there are several exercises at 
the end of this chapter. One important point to bear in mind is that when the mass 
in a body is distributed continuously, the sum in the definition (3.9) goes over to an 
integral 


$$\mathbf{R} = \frac{1}{M}\int \mathbf{r}\,dm = \frac{1}{M}\int \varrho\,\mathbf{r}\,dV \qquad (3.13)$$

where $\varrho$ is the mass density of the body, $dV$ denotes an element of volume, and the
integral runs over the whole body (that is, everywhere $\varrho \ne 0$). We shall be using similar
integrals to evaluate the moment of inertia tensor in Chapter 10. Meanwhile, here is
one example:

example 3.2 The CM of a Solid Cone

Find the CM position for the uniform solid cone shown in Figure 3.4. 

It is perhaps obvious by symmetry that the CM lies on the axis of symmetry 
(the z axis), but this also follows immediately from the integral (3.13). For 
example, if you consider the x component of that integral, it is easy to see that 
the contribution from any point $(x, y, z)$ is exactly cancelled by that from the
point $(-x, y, z)$. That is, the integral for $X$ is zero. Because the same argument
applies to $Y$, the CM lies on the $z$ axis. To find the height $Z$ of the CM, we must
evaluate the integral

$$Z = \frac{1}{M}\int \varrho\,z\,dV = \frac{\varrho}{M}\int z\,dx\,dy\,dz$$

where I could take the factor $\varrho$ outside the integral since $\varrho$ is constant throughout
the cone (as long as we understand the integral is limited to the inside of the cone)
and I have changed the volume element $dV$ to $dx\,dy\,dz$. For any given $z$, the
integral over $x$ and $y$ runs over a circle of radius $r = Rz/h$, giving a factor of
$\pi r^2 = \pi R^2 z^2/h^2$, so that

$$Z = \frac{\varrho\pi R^2}{Mh^2}\int_0^h z^3\,dz = \frac{\varrho\pi R^2}{Mh^2}\,\frac{h^4}{4} = \frac{3}{4}h$$

where in the last step I replaced the mass $M$ by $\varrho$ times the volume or
$M = \frac{1}{3}\varrho\pi R^2 h$. We conclude that the CM is on the axis of the cone at a distance $\frac{3}{4}h$
from the vertex (or $\frac{1}{4}h$ from the base).
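
The same integral can be approximated numerically by slicing the cone into thin disks, which gives a useful sanity check on the result. In this Python sketch the radius and height are arbitrary assumed values, and the uniform density cancels out of the ratio, so it does not appear.

```python
import math


def cone_cm_height(R, h, n=100_000):
    """Approximate Z for a uniform cone (vertex at origin) by summing thin disks."""
    dz = h / n
    volume = 0.0
    moment = 0.0
    for k in range(n):
        z = (k + 0.5) * dz                    # midpoint of the k-th slice
        area = math.pi * (R * z / h) ** 2     # disk of radius r = R z / h
        volume += area * dz
        moment += z * area * dz
    return moment / volume


# Assumed dimensions: base radius 1, height 2 (any units).
Z = cone_cm_height(1.0, 2.0)
print("numerical Z:", Z, " expected 3h/4:", 0.75 * 2.0)
```

The answer does not depend on the base radius, only on the height, just as the closed form says.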



Figure 3.4 A solid cone, centered on the z axis, with 
vertex at the origin and uniform mass density $\varrho$. Its height
is $h$ and its base has radius $R$.


3.4 Angular Momentum for a Single Particle


In many ways the conservation of angular momentum parallels the conservation of 
ordinary (or “linear”) momentum. Nevertheless, I would like to review the formalism 
in detail, first for a single particle and then for a multiparticle system. This will 
introduce several important ideas and some useful mathematics. 

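The angular momentum $\boldsymbol{\ell}$ of a single particle is defined as the vector

$$\boldsymbol{\ell} = \mathbf{r} \times \mathbf{p}. \qquad (3.14)$$

Here $\mathbf{r} \times \mathbf{p}$ is the vector product of the particle’s position vector $\mathbf{r}$, relative to the
chosen origin $O$, and its momentum $\mathbf{p}$, as shown in Figure 3.5. Notice that because
$\mathbf{r}$ depends on the choice of origin, the same is true of $\boldsymbol{\ell}$: The angular momentum
$\boldsymbol{\ell}$ (unlike the linear momentum $\mathbf{p}$) depends on the choice of origin, and we should,
strictly speaking, refer to $\boldsymbol{\ell}$ as the angular momentum relative to $O$.

The time rate of change of $\boldsymbol{\ell}$ is easily found:

$$\dot{\boldsymbol{\ell}} = \frac{d}{dt}(\mathbf{r} \times \mathbf{p}) = (\dot{\mathbf{r}} \times \mathbf{p}) + (\mathbf{r} \times \dot{\mathbf{p}}). \qquad (3.15)$$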

(You can easily check that the product rule can be used for differentiating vector 
products, as long as you are careful to keep the vectors in the right order. See 
Problem 1.17.) In the first term on the right, we can replace $\mathbf{p}$ by $m\dot{\mathbf{r}}$, and, because the
cross product of any two parallel vectors is zero, the first term is zero. In the second
term, we can replace $\dot{\mathbf{p}}$ by the net force $\mathbf{F}$ on the particle, and we get

$$\dot{\boldsymbol{\ell}} = \mathbf{r} \times \mathbf{F} = \boldsymbol{\Gamma}. \qquad (3.16)$$

Here $\boldsymbol{\Gamma}$ (Greek capital gamma) denotes the net torque about $O$ on the particle, defined
as $\mathbf{r} \times \mathbf{F}$. (Other popular symbols for torque are $\boldsymbol{\tau}$ and $\mathbf{N}$.) In words, (3.16) says that
the rate of change of a particle’s angular momentum about the origin $O$ is equal to the
net applied torque about $O$. Equation (3.16) is the rotational analog of the equation
$\dot{\mathbf{p}} = \mathbf{F}$ for the linear momentum, and (3.16) is often described as the rotational form
of Newton’s second law. 
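
Equation (3.16) is easy to test on a motion we already know. The Python sketch below takes a projectile in uniform gravity (no air resistance), computes its angular momentum about the launch point from (3.14), differentiates numerically, and compares with the torque r x F. The mass and launch velocity are assumed values, and the finite-difference step is an arbitrary choice.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Assumed values: a 0.15 kg ball launched from the origin with velocity v0.
m, g = 0.15, 9.8
v0 = (10.0, 15.0, 0.0)


def position(t):
    return (v0[0] * t, v0[1] * t - 0.5 * g * t * t, 0.0)


def momentum(t):
    return (m * v0[0], m * (v0[1] - g * t), 0.0)


def angular_momentum(t):
    # Equation (3.14), taken about the origin (the launch point).
    return cross(position(t), momentum(t))


t, h = 1.2, 1e-4

# Central-difference estimate of the rate of change of angular momentum.
l_plus = angular_momentum(t + h)
l_minus = angular_momentum(t - h)
dl_dt = tuple((a - b) / (2 * h) for a, b in zip(l_plus, l_minus))

# Torque about the origin from the only force, gravity.
force = (0.0, -m * g, 0.0)
torque = cross(position(t), force)

print("d(l)/dt:", dl_dt)
print("torque :", torque)
```

Both vectors point along the negative z direction and agree to within the accuracy of the numerical derivative.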



Figure 3.5 For any particle with position r relative to the origin 
O and momentum p, the angular momentum about O is defined 
as the vector $\boldsymbol{\ell} = \mathbf{r} \times \mathbf{p}$. For the case shown, $\boldsymbol{\ell}$ points into the
page.

Figure 3.6 A planet (mass $m$) is subject to the central force
of the sun (mass $M$). If we choose the origin at the sun, then
$\mathbf{r} \times \mathbf{F} = 0$, and the planet’s angular momentum about $O$ is
constant. 


In many one-particle problems one can choose the origin O so that the net torque 
$\boldsymbol{\Gamma}$ (about the chosen $O$) is zero. In this case, the particle’s angular momentum about
$O$ is constant. Consider, for example, a single planet (or comet) orbiting the sun. The
only force on the planet is the gravitational pull $GmM/r^2$ of the sun, as shown in
Figure 3.6. A crucial property of the gravitational force is that it is central, that is,
directed along the line joining the two centers. This means that $\mathbf{F}$ is parallel (actually,
antiparallel) to the position vector $\mathbf{r}$ measured from the sun, and hence that $\mathbf{r} \times \mathbf{F} = 0$.
Thus if we choose our origin at the sun, the planet’s angular momentum about $O$ is
constant, a fact that greatly simplifies the analysis of planetary motion. For example,
because $\mathbf{r} \times \mathbf{p}$ is constant, $\mathbf{r}$ and $\mathbf{p}$ must remain in a fixed plane; in other words, the
planet’s orbit is confined to a single plane containing the sun, and the problem is 
reduced to two dimensions, a result we shall exploit in Chapter 8. 


Kepler’s Second Law 

One of the earliest triumphs for Newton’s mechanics was that he was able to explain 
Kepler’s second law as a simple consequence of conservation of angular momentum. 
Newton’s laws of motion were published in 1687 in his famous book Principia. 
Nearly eighty years earlier, the German astronomer Johannes Kepler (1571-1630) 
had published his three laws of planetary motion.$^3$ These laws are quite different
from Newton’s laws in that they are simply mathematical descriptions of the observed 
motion of the planets. For example, the first law states that the planets move around 
the sun in ellipses with the sun at one focus. Kepler’s laws make no attempt to explain 
planetary motion in terms of more fundamental ideas; they are just summaries — 
brilliant summaries, requiring great insight, but nonetheless just summaries — of the 
observed motions of the planets. All three of Kepler’s laws turn out to be consequences 
of Newton’s laws of motion. I shall derive the first and third of the Kepler laws in 
Chapter 8. The second we are ready to discuss now. 


3 Kepler’s first two laws appeared in his book Astronomia Nova in 1609 and the third in another 
book, Harmonices Mundi, published in 1619.

Figure 3.7 The orbit of a planet with the sun fixed at $O$. Kepler’s
second law asserts that if the two pairs of points $P$, $Q$ and $P'$, $Q'$
are separated by equal time intervals, $dt = dt'$, then the two areas
dA and dA' are equal. 


Kepler’s second law is generally stated something like this: 


Kepler’s Second Law 

As each planet moves around the sun, a line drawn from the planet to the sun 
sweeps out equal areas in equal times. 


This rather curious statement is illustrated in Figure 3.7, which shows the path of a 
planet or comet — the law applies to comets as well — orbiting about the sun at the 
origin O. (Throughout this discussion, I shall make the approximation that the sun is 
fixed; we shall see how to allow for the very small motion of the sun in Chapter 8.) 
The area “swept out” by the planet moving between any two points P and Q is just 
the area of the triangle OPQ. (Strictly speaking the “triangle” is the area between the 
two lines OP and OQ and the arc PQ. However, it is sufficient to consider pairs of 
points $P$ and $Q$ that are close together, in which case the difference between the arc
$PQ$ and the straight line $PQ$ is negligible.) I shall denote the time elapsed between the
planet’s visiting $P$ and $Q$ by $dt$ and the corresponding area of $OPQ$ by $dA$. Kepler’s
second law asserts that if we choose any other pair of points $P'$ and $Q'$ separated by
the same time interval ($dt' = dt$), then the area $OP'Q'$ will be the same as $OPQ$, or
$dA' = dA$. Equivalently we can divide both sides of this equality by $dt$ and assert that
the rate at which the planet sweeps out area, dA/dt, is the same at all points on the 
orbit; that is, dA/dt is constant. 

To prove this result, we note first that the line OP is just the position vector r, 
and PQ is the displacement dr = v dt. Now, it is a well-known property of the vector 
product that if two sides of a triangle are given by vectors a and b, then the area of 
the triangle is $\tfrac{1}{2}|\mathbf{a} \times \mathbf{b}|$. (See Problem 3.24.) Thus the area of the triangle OPQ is 

$$dA = \tfrac{1}{2}\,|\mathbf{r} \times \mathbf{v}\,dt|.$$

Replacing v by p/m and dividing both sides by dt, we find that 


$$\frac{dA}{dt} = \frac{1}{2m}\,|\mathbf{r} \times \mathbf{p}| = \frac{1}{2m}\,\ell \tag{3.17}$$

where $\ell$ denotes the magnitude of the angular momentum $\boldsymbol{\ell} = \mathbf{r} \times \mathbf{p}$. Since the planet’s 
angular momentum about the sun is conserved, this establishes that dA/dt is constant, 
which, as we have seen, is the content of Kepler’s second law. 

An alternative proof of the same result adds some additional insight: It is a 
straightforward exercise to show that (Problem 3.27) 

$$\ell = m r^2 \omega \tag{3.18}$$

where $\omega = \dot{\phi}$ is the planet’s angular velocity around the sun. And it is an equally 
simple geometrical exercise to show that the rate of sweeping out area is 

$$\frac{dA}{dt} = \tfrac{1}{2}\, r^2 \omega. \tag{3.19}$$

Comparison of (3.18) and (3.19) shows that $\ell$ is constant if and only if dA/dt is 
constant. That is, conservation of angular momentum is exactly equivalent to Kepler’s 
second law. In addition, we see that as the planet (or comet) approaches closer to 
the sun (r decreasing) its angular velocity $\omega$ necessarily increases. Specifically, $\omega$ is 
inversely proportional to $r^2$; for example, if the value of r at point P' is half that at P, 
then the angular velocity $\omega$ at P' is four times that at P. 

It is interesting to note that our proof of Kepler’s second law depended only on 
the fact that the gravitational force is central and hence that the planet’s angular 
momentum about the sun is constant. Thus Kepler’s second law is true for an object 
that moves under the influence of any central force. By contrast, we shall see in 
Chapter 8 that the first and third laws (in particular the first, which says that the orbits 
are ellipses with the sun at one focus) depend on the inverse-square nature of the 
gravitational force and are not true for other force laws. 
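
The claim that the equal-areas law holds for any central force can be checked numerically. The following Python sketch is only an illustration, not part of the derivation: the two force laws, the initial conditions, the unit mass, and the step size are all arbitrary choices made for this example. It integrates a planar orbit with the velocity Verlet method and samples the areal rate dA/dt = |r x v|/2 at several times.

```python
import math

def areal_rates(force_mag, r0=(1.0, 0.0), v0=(0.0, 0.8), m=1.0,
                dt=1e-4, steps=60000, sample_every=10000):
    """Integrate motion under an attractive central force of magnitude
    force_mag(r) and return samples of dA/dt = |r x v| / 2."""
    x, y = r0
    vx, vy = v0

    def accel(x, y):
        r = math.hypot(x, y)
        f = force_mag(r)                  # magnitude of the pull toward O
        return -f * x / (m * r), -f * y / (m * r)

    ax, ay = accel(x, y)
    rates = []
    for step in range(steps + 1):
        if step % sample_every == 0:
            rates.append(0.5 * abs(x * vy - y * vx))
        # velocity Verlet update
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        ax_new, ay_new = accel(x, y)
        vx += 0.5 * (ax + ax_new) * dt
        vy += 0.5 * (ay + ay_new) * dt
        ax, ay = ax_new, ay_new
    return rates

# Inverse-square attraction (gravity-like) and a linear, spring-like attraction.
inverse_square = areal_rates(lambda r: 1.0 / r**2)
linear_spring = areal_rates(lambda r: r)

print(inverse_square)
print(linear_spring)
```

With these starting values the initial areal rate is 0.4 in both cases, and the sampled values should stay at that value throughout, even though the two orbits have quite different shapes.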


3.5 Angular Momentum for Several Particles 


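Let us next discuss a system of N particles, $\alpha = 1, 2, \ldots, N$, each with its angular 
momentum $\boldsymbol{\ell}_\alpha = \mathbf{r}_\alpha \times \mathbf{p}_\alpha$ (with all of the $\mathbf{r}_\alpha$ measured from the same origin O, of 
course). We define the total angular momentum L as 

$$\mathbf{L} = \sum_{\alpha=1}^{N} \boldsymbol{\ell}_\alpha = \sum_{\alpha=1}^{N} \mathbf{r}_\alpha \times \mathbf{p}_\alpha. \tag{3.20}$$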
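Differentiating with respect to t and using the result (3.16), we find that 

$$\dot{\mathbf{L}} = \sum_\alpha \dot{\boldsymbol{\ell}}_\alpha = \sum_\alpha \mathbf{r}_\alpha \times \mathbf{F}_\alpha, \tag{3.21}$$

where, as usual, $\mathbf{F}_\alpha$ denotes the net force on particle $\alpha$. This result shows that the rate 
of change of L is just the net torque on the whole system, an important result in its own 
right. However, my interest now is to separate the effects of the internal and external 
forces. As in Equation (1.25) we write $\mathbf{F}_\alpha$ as 

$$(\text{net force on particle } \alpha) = \mathbf{F}_\alpha = \sum_{\beta \neq \alpha} \mathbf{F}_{\alpha\beta} + \mathbf{F}_\alpha^{\text{ext}} \tag{3.22}$$

where, as before, $\mathbf{F}_{\alpha\beta}$ denotes the force exerted on particle $\alpha$ by particle $\beta$, and $\mathbf{F}_\alpha^{\text{ext}}$ 
is the net force exerted on particle $\alpha$ by all agents outside our N-particle system. 
Substituting into (3.21), we find that 

$$\dot{\mathbf{L}} = \sum_\alpha \sum_{\beta \neq \alpha} \mathbf{r}_\alpha \times \mathbf{F}_{\alpha\beta} + \sum_\alpha \mathbf{r}_\alpha \times \mathbf{F}_\alpha^{\text{ext}}. \tag{3.23}$$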

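Equation (3.23) corresponds to Equation (1.27) in our discussion of linear momentum 
back in Chapter 1, and we can rework it in much the same way as there, with one 
interesting additional twist. We can regroup the terms of the double sum, pairing each 
term $\alpha\beta$ with the corresponding term $\beta\alpha$, to give⁴ 

$$\sum_\alpha \sum_{\beta \neq \alpha} \mathbf{r}_\alpha \times \mathbf{F}_{\alpha\beta} = \sum_\alpha \sum_{\beta > \alpha} \left( \mathbf{r}_\alpha \times \mathbf{F}_{\alpha\beta} + \mathbf{r}_\beta \times \mathbf{F}_{\beta\alpha} \right). \tag{3.24}$$

If we assume that all the internal forces obey the third law ($\mathbf{F}_{\alpha\beta} = -\mathbf{F}_{\beta\alpha}$), then we can 
rewrite the sum on the right as 

$$\sum_\alpha \sum_{\beta > \alpha} (\mathbf{r}_\alpha - \mathbf{r}_\beta) \times \mathbf{F}_{\alpha\beta}. \tag{3.25}$$

To understand this sum, we must examine the vector $(\mathbf{r}_\alpha - \mathbf{r}_\beta) = \mathbf{r}_{\alpha\beta}$, say. This is 
illustrated in Figure 3.8, where we see that $\mathbf{r}_{\alpha\beta}$ is the vector pointing toward particle $\alpha$ 
from particle $\beta$. If, in addition to satisfying the third law, the forces $\mathbf{F}_{\alpha\beta}$ are all central, 
then the two vectors $\mathbf{r}_{\alpha\beta}$ and $\mathbf{F}_{\alpha\beta}$ point along the same line, and their cross product is 
zero. 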

Returning to Equation (3.23), we conclude that, provided our various assumptions 
are valid, the double sum in (3.23) is zero. The remaining single sum is just the net 
external torque, and we conclude that 


$$\dot{\mathbf{L}} = \boldsymbol{\Gamma}^{\text{ext}}. \tag{3.26}$$

In particular, if the net external torque is zero, we have the 


4 Be sure you understand what has happened here. For example, I have paired the term $\mathbf{r}_1 \times \mathbf{F}_{12}$ 
with the term $\mathbf{r}_2 \times \mathbf{F}_{21}$. 

Figure 3.8 The vector $\mathbf{r}_{\alpha\beta} = (\mathbf{r}_\alpha - \mathbf{r}_\beta)$ points to particle $\alpha$ 
from particle $\beta$. If the force $\mathbf{F}_{\alpha\beta}$ is central (points along the 
line joining $\alpha$ and $\beta$), then $\mathbf{r}_{\alpha\beta}$ and $\mathbf{F}_{\alpha\beta}$ are collinear and their 
cross product is zero. 


Principle of Conservation of Angular Momentum 
If the net external torque on an N-particle system is zero, the system's total 
angular momentum $\mathbf{L} = \sum \mathbf{r}_\alpha \times \mathbf{p}_\alpha$ is constant. 


The validity of this principle depends on our two assumptions that all internal forces 
$\mathbf{F}_{\alpha\beta}$ are central and satisfy the third law. Since these assumptions are almost always 
valid, the principle (as stated) is likewise. It is of the greatest utility in solving many 
problems, as I shall illustrate shortly with a couple of simple examples. 
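
To see concretely why both assumptions matter, here is a small Python sketch. It is an illustration under assumptions of my own choosing: the five random positions, the inverse-square central force, and the artificial non-central force W x (r_a - r_b) are all hypothetical. Both force laws obey the third law, but only the central one makes the double sum of internal torques vanish.

```python
import random

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def norm(a):
    return (a[0] ** 2 + a[1] ** 2 + a[2] ** 2) ** 0.5

def central_force(ra, rb):
    """Force on a due to b: attractive, inverse-square, along r_a - r_b."""
    d = sub(ra, rb)
    s = -1.0 / norm(d) ** 3
    return (s * d[0], s * d[1], s * d[2])

W = (0.0, 0.0, 1.0)  # fixed vector used only to build a non-central force

def noncentral_force(ra, rb):
    """F = W x (r_a - r_b): obeys the third law, but is perpendicular
    to the line joining the two particles."""
    return cross(W, sub(ra, rb))

def net_internal_torque(positions, force):
    """Double sum over a and b != a of r_a x F_ab."""
    total = (0.0, 0.0, 0.0)
    for a, ra in enumerate(positions):
        for b, rb in enumerate(positions):
            if a != b:
                t = cross(ra, force(ra, rb))
                total = (total[0] + t[0], total[1] + t[1], total[2] + t[2])
    return total

random.seed(1)
positions = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(5)]

torque_central = net_internal_torque(positions, central_force)
torque_noncentral = net_internal_torque(positions, noncentral_force)
print(norm(torque_central), norm(torque_noncentral))
```

The first number should be zero to within rounding error, while the second is not, which is the numerical counterpart of the argument leading to (3.26).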


The Moment of Inertia 

Before discussing an example, it is worth noting that the calculation of angular 
momenta does not always require one to go back to the basic definition (3.20). As you 
probably recall from your introductory physics course, for a rigid body rotating about 
a fixed axis (for example, a wheel rotating on its fixed axle), the rather complicated 
sum (3.20) can be expressed in terms of the moment of inertia and the angular velocity 
of rotation. Specifically, if we take the axis of rotation to be the z axis, then $L_z$, the z 
component of angular momentum, is just $L_z = I\omega$, where I is the moment of inertia 
of the body for the given axis, and $\omega$ is the angular velocity of rotation. We shall prove 
and generalize this result in Chapter 10, or you can prove it yourself with the guidance 
of Problem 3.30. For now, I shall ask you to carry it over from introductory physics. 
In particular, as you may recall, the moments of inertia of various standard bodies 
are known. For example, for a uniform disk (mass M, radius R) rotating about its 
axis, $I = \tfrac{1}{2}MR^2$. For a uniform solid sphere rotating about a diameter, $I = \tfrac{2}{5}MR^2$. In 
general, for any multiparticle system, $I = \sum m_\alpha \rho_\alpha^2$, where $\rho_\alpha$ is the distance of the 
mass $m_\alpha$ from the axis of rotation. 
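
The general sum can be evaluated directly on a computer. The Python sketch below is a rough check rather than a derivation: it approximates a uniform disk and a uniform solid sphere by many equal point masses on a regular grid (the grid sizes and the values of M and R are arbitrary choices for this example) and compares the summed moment of inertia with the standard formulas quoted above.

```python
def moment_of_inertia(points, total_mass):
    """Sum of m * rho^2 over equal point masses; rho is the distance
    of each point from the z axis."""
    m = total_mass / len(points)
    return sum(m * (p[0] ** 2 + p[1] ** 2) for p in points)

def grid(n, R):
    """Centers of n equal cells spanning the interval [-R, R]."""
    h = 2.0 * R / n
    return [-R + (i + 0.5) * h for i in range(n)]

M, R = 2.0, 1.5

# Uniform disk in the xy plane, rotating about the z axis.
g2 = grid(400, R)
disk = [(x, y, 0.0) for x in g2 for y in g2 if x * x + y * y <= R * R]

# Uniform solid sphere, rotating about the z axis (a diameter).
g3 = grid(60, R)
sphere = [(x, y, z) for x in g3 for y in g3 for z in g3
          if x * x + y * y + z * z <= R * R]

I_disk = moment_of_inertia(disk, M)
I_sphere = moment_of_inertia(sphere, M)

print(I_disk, 0.5 * M * R ** 2)     # grid estimate vs (1/2) M R^2
print(I_sphere, 0.4 * M * R ** 2)   # grid estimate vs (2/5) M R^2
```

The agreement is only approximate, because the grid cannot follow the curved boundary exactly; a finer grid should bring the two numbers closer together.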
EXAMPLE 3.3 Collision of a Lump of Putty with a Turntable 

A uniform circular turntable (mass M, radius R, center O) is at rest in the xy 
plane and is mounted on a frictionless axle, which lies along the vertical z axis. 
I throw a lump of putty (mass m) with speed v toward the edge of the turntable, 
so it approaches along a line that passes within a distance⁵ b of O, as shown in 
Figure 3.9. When the putty hits the turntable, it sticks to the edge, and the two 
rotate together with angular velocity $\omega$. Find $\omega$. 

This problem is easily solved using conservation of angular momentum. 
Because the turntable is mounted on a frictionless axle, there is no torque on 
the table in the z direction. Therefore the z component of the external torque 
on the system is zero, and L z is conserved. (This is true even if we include 
gravity, which acts in the z direction and contributes nothing to the torque in 
the z direction.) Before the collision, the turntable has zero angular momentum, 
while the putty has $\boldsymbol{\ell} = \mathbf{r} \times \mathbf{p}$, which points in the z direction. Thus the initial 
total angular momentum has z component 

$$L_z^{\text{in}} = \ell_z = r(mv)\sin\theta = mvb.$$

After the collision, the putty and turntable rotate together about the z axis with 
total moment of inertia⁶ $I = (m + M/2)R^2$, and the z component of the final 
angular momentum is $L_z^{\text{fin}} = I\omega$. Therefore, conservation of angular momentum 
in the form $L_z^{\text{in}} = L_z^{\text{fin}}$ tells us that 

$$mvb = (m + M/2)R^2\omega,$$



Figure 3.9 A lump of putty of mass m is thrown with velocity 
v at a stationary turntable. The putty’s line of approach passes 
within the distance b of the table’s center O. 


5 In collision theory (the theory of collisions, usually between atomic or subatomic particles) 
the distance b is called the impact parameter. 

6 This is $mR^2$ for the putty stuck at radius R plus $\tfrac{1}{2}MR^2$ for the uniform turntable. 
or, solving for $\omega$, 

$$\omega = \frac{m}{(m + M/2)}\,\frac{vb}{R^2}. \tag{3.27}$$


This answer is not especially interesting. What is interesting is that we were able 
to find it with so comparatively little effort. This is typical of the conservation 
laws, that they can answer many questions so simply. The kind of analysis used 
here can be used in many situations (such as nuclear reactions) where an incident 
projectile is absorbed by a stationary target and its angular momentum is shared 
between the two bodies. 
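
As a quick numerical illustration of (3.27), the Python sketch below equates the initial and final z components of angular momentum for one set of sample values. The function name and all the numbers are hypothetical, chosen only to show the calculation.

```python
def turntable_omega(m, M, v, b, R):
    """Angular velocity after the putty sticks to the rim, from
    L_z(initial) = L_z(final), as in Eq. (3.27)."""
    L_initial = m * v * b                 # putty only; the turntable is at rest
    I_final = (m + M / 2.0) * R ** 2      # putty at radius R plus uniform disk
    return L_initial / I_final

# Sample values in SI units (assumed for illustration).
omega = turntable_omega(m=0.1, M=2.0, v=3.0, b=0.2, R=0.25)
omega_head_on = turntable_omega(m=0.1, M=2.0, v=3.0, b=0.0, R=0.25)

print(omega)          # rad/s
print(omega_head_on)  # zero: putty aimed straight at O carries no angular momentum
```

Setting b = 0 is a useful sanity check: a lump aimed directly at the axle has no angular momentum about O, so the turntable does not rotate at all.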


Angular Momentum about the CM 

The conservation of angular momentum and the more general result (3.26), $\dot{\mathbf{L}} = \boldsymbol{\Gamma}^{\text{ext}}$, 
were derived on the assumption that all quantities were measured in an inertial frame, 
so that Newton’s second law could be invoked. This required that both L and $\boldsymbol{\Gamma}^{\text{ext}}$ be 
measured about an origin O fixed in some inertial frame. Remarkably, the same two 
results also hold if L and $\boldsymbol{\Gamma}^{\text{ext}}$ are measured about the center of mass, even if the 
CM is being accelerated and so is not fixed in an inertial frame. That is, 

$$\frac{d}{dt}\mathbf{L}(\text{about CM}) = \boldsymbol{\Gamma}^{\text{ext}}(\text{about CM}) \tag{3.28}$$

and hence, if $\boldsymbol{\Gamma}^{\text{ext}}$(about CM) = 0, then L (about CM) is conserved. We shall prove 
this result in Chapter 10, or you can prove it yourself with the guidance of Problem 
3.37. I mention it now, because it allows a very simple solution to various problems, 
as the following example illustrates. 

EXAMPLE 3.4 A Sliding and Spinning Dumbbell 

A dumbbell consisting of two equal masses m mounted on the ends of a rigid 
massless rod of length 2b is at rest on a frictionless horizontal table, lying on 
the x axis and centered on the origin, as shown in Figure 3.10. At time t = 0, 
the left mass is given a sharp tap, in the shape of a horizontal force F in the y 
direction, lasting for a short time $\Delta t$. Describe the subsequent motion. 

There are actually two parts to this problem: We must find the initial motion 
immediately after the impulse, and then the subsequent, force-free motion. The 
initial motion is not hard to guess, but let us derive it using the tools of this 
chapter. The only external force is the force F acting in the y direction for 
the brief time $\Delta t$. Since $\dot{\mathbf{P}} = \mathbf{F}^{\text{ext}}$, the total momentum just after the impulse 
is $\mathbf{P} = \mathbf{F}\,\Delta t$. Since $\mathbf{P} = M\dot{\mathbf{R}}$ (with M = 2m), we conclude that the CM starts 
moving directly up the y axis with velocity 

$$v_{\text{cm}} = \dot{R} = F\,\Delta t/2m.$$

While the force F is acting, there is a torque $\Gamma^{\text{ext}} = Fb$ about the CM, and so, 
according to (3.28), the initial angular momentum (just after the impulse has 
ceased) is $L = Fb\,\Delta t$. Since $L = I\omega$, with $I = 2mb^2$, we conclude that the 
dumbbell is spinning clockwise, with initial angular velocity 

$$\omega = F\,\Delta t/2mb.$$

Figure 3.10 The left mass of the dumbbell is given a sharp 
tap in the y direction. 

The clockwise rotation of the dumbbell means that the left mass is moving 
up relative to the CM with speed $\omega b$, and its total initial velocity is 

$$v_{\text{left}} = v_{\text{cm}} + \omega b = F\,\Delta t/m.$$

By the same token the right mass is moving down relative to the CM, and its 
total initial velocity is 

$$v_{\text{right}} = v_{\text{cm}} - \omega b = 0.$$

That is, the right mass is initially stationary, while the left one carries all the 
momentum $F\,\Delta t$ of the system. 

The subsequent motion is very straightforward. Once the impulse has ceased, 
there are no external forces or torques. Thus the CM continues to move straight 
up the y axis with constant speed, and the dumbbell continues to rotate with 
constant angular momentum about the CM and hence constant angular velocity. 
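
The bookkeeping in this example is easy to reproduce numerically. The Python sketch below simply follows the steps above for one set of made-up values; the function name and the numbers are assumptions for illustration only. It computes the CM velocity, the angular velocity, and the initial velocities of the two masses, and then the positions of the masses at a later time.

```python
import math

def dumbbell_after_tap(F, tap_time, m, b):
    """Initial motion of the dumbbell just after the impulse F * tap_time."""
    P = F * tap_time               # total momentum, in the y direction
    v_cm = P / (2.0 * m)           # CM velocity, since M = 2m
    L = F * b * tap_time           # angular momentum about the CM
    I = 2.0 * m * b ** 2           # two masses m at distance b from the CM
    omega = L / I                  # clockwise angular velocity
    v_left = v_cm + omega * b
    v_right = v_cm - omega * b
    return v_cm, omega, v_left, v_right

def positions(t, v_cm, omega, b):
    """Positions of the left and right masses at time t after the tap
    (CM moving up the y axis, rod turning clockwise at rate omega)."""
    theta = omega * t
    cm = (0.0, v_cm * t)
    left = (cm[0] - b * math.cos(theta), cm[1] + b * math.sin(theta))
    right = (cm[0] + b * math.cos(theta), cm[1] - b * math.sin(theta))
    return left, right

F, tap_time, m, b = 50.0, 0.01, 0.5, 0.2   # sample SI values (assumed)
v_cm, omega, v_left, v_right = dumbbell_after_tap(F, tap_time, m, b)
print(v_cm, omega, v_left, v_right)
print(positions(1.0, v_cm, omega, b))
```

The right mass comes out with zero initial velocity, and the left mass carries the whole momentum F Δt, in agreement with the result derived above.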


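Principal Definitions and Equations of Chapter 3 

Equation of Motion for a Rocket 

$$m\dot{v} = -\dot{m}v_{\text{ex}} + F^{\text{ext}}.$$ [Eqs. (3.6) & (3.29)] 

The Center of Mass of Several Particles 

$$\mathbf{R} = \frac{1}{M}\sum_{\alpha=1}^{N} m_\alpha \mathbf{r}_\alpha = \frac{m_1\mathbf{r}_1 + \cdots + m_N\mathbf{r}_N}{M}$$ [Eq. (3.9)] 

where M is the total mass of all particles, $M = \sum m_\alpha$. 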
Angular Momentum 

For a single particle with position r (relative to an origin O) and momentum p, the 
angular momentum about O is 

$$\boldsymbol{\ell} = \mathbf{r} \times \mathbf{p}.$$ [Eq. (3.14)] 

For several particles, the total angular momentum is 

$$\mathbf{L} = \sum_{\alpha=1}^{N} \boldsymbol{\ell}_\alpha = \sum_{\alpha=1}^{N} \mathbf{r}_\alpha \times \mathbf{p}_\alpha.$$ [Eq. (3.20)] 

Provided all the internal forces are central, 

$$\dot{\mathbf{L}} = \boldsymbol{\Gamma}^{\text{ext}}$$ [Eq. (3.26)] 

where $\boldsymbol{\Gamma}^{\text{ext}}$ is the net external torque. 


Problems for Chapter 3 

Stars indicate the approximate level of difficulty, from easiest (★) to most difficult (★★★). 

SECTION 3.1 Conservation of Momentum 

3.1 ★ Consider a gun of mass M (when unloaded) that fires a shell of mass m with muzzle speed v. 
(That is, the shell’s speed relative to the gun is v.) Assuming that the gun is completely free to recoil (no 
external forces on gun or shell), use conservation of momentum to show that the shell’s speed relative 
to the ground is v/(1 + m/M). 

3.2 ★ A shell traveling with speed $v_0$ exactly horizontally and due north explodes into two equal-mass 
fragments. It is observed that just after the explosion one fragment is traveling vertically up with speed 
$v_0$. What is the velocity of the other fragment? 

3.3 ★ A shell traveling with velocity $\mathbf{v}_0$ explodes into three pieces of equal masses. Just after the 
explosion, one piece has velocity $\mathbf{v}_1 = \mathbf{v}_0$ and the other two have velocities $\mathbf{v}_2$ and $\mathbf{v}_3$ that are equal in 
magnitude ($v_2 = v_3$) but mutually perpendicular. Find $\mathbf{v}_2$ and $\mathbf{v}_3$ and sketch the three velocities. 

3.4 ★★ Two hobos, each of mass $m_h$, are standing at one end of a stationary railroad flatcar with 
frictionless wheels and mass $m_{fc}$. Either hobo can run to the other end of the flatcar and jump off 
with the same speed u (relative to the car). (a) Use conservation of momentum to find the speed of 
the recoiling car if the two men run and jump simultaneously. (b) What is it if the second man starts 
running only after the first has already jumped? Which procedure gives the greater speed to the car? 
[Hint: The speed u is the speed of either hobo, relative to the car just after he has jumped; it has the 
same value for either man and is the same in parts (a) and (b).] 

3.5 ★★ Many applications of conservation of momentum involve conservation of energy as well, and 
we haven’t yet begun our discussion of energy. Nevertheless, you know enough about energy from 
your introductory physics course to handle some problems of this type. Here is one elegant example: 
An elastic collision between two bodies is defined as a collision in which the total kinetic energy 
of the two bodies after the collision is the same as that before. (A familiar example is the collision 
between two billiard balls, which generally lose extremely little of their total kinetic energy.) Consider 
an elastic collision between two equal-mass bodies, one of which is initially at rest. Let their velocities 
be $\mathbf{v}_1$ and $\mathbf{v}_2 = 0$ before the collision, and $\mathbf{v}'_1$ and $\mathbf{v}'_2$ after. Write down the vector equation representing 
conservation of momentum and the scalar equation which expresses that the collision is elastic. Use 
these to prove that the angle between $\mathbf{v}'_1$ and $\mathbf{v}'_2$ is 90°. This result was important in the history of atomic 
and nuclear physics: That two bodies emerged from a collision traveling on perpendicular paths was 
strongly suggestive that they had equal mass and had undergone an elastic collision. 

SECTION 3.2 Rockets 

3.6 ★ In the early stages of the Saturn V rocket’s launch, mass was ejected at about 15,000 kg/s, with 
a speed $v_{\text{ex}} \approx 2500$ m/s relative to the rocket. What was the thrust on the rocket? Convert this to tons 
(1 ton ≈ 9000 newtons) and compare with the rocket’s initial weight (about 3000 tons). 

3.7 ★ The first couple of minutes of the launch of a space shuttle can be described very roughly as 
follows: The initial mass is $2 \times 10^6$ kg, the final mass (after 2 minutes) is about $1 \times 10^6$ kg, the average 
exhaust speed $v_{\text{ex}}$ is about 3000 m/s, and the initial velocity is, of course, zero. If all this were taking 
place in outer space, with negligible gravity, what would be the shuttle’s speed at the end of this stage? 
What is the thrust during the same period and how does it compare with the initial total weight of the 
shuttle (on earth)? 

3.8 ★ A rocket (initial mass $m_0$) needs to use its engines to hover stationary, just above the ground. (a) If 
it can afford to burn no more than a mass $\lambda m_0$ of its fuel, for how long can it hover? [Hint: Write down 
the condition that the thrust just balance the force of gravity. You can integrate the resulting equation 
by separating the variables t and m. Take $v_{\text{ex}}$ to be constant.] (b) If $v_{\text{ex}} \approx 3000$ m/s and $\lambda \approx 10\%$, for 
how long could the rocket hover just above the earth’s surface? 

3.9 ★ From the data in Problem 3.7 you can find the space shuttle’s initial mass and the rate of ejecting 
mass $-\dot{m}$ (which you may assume is constant). What is the minimum exhaust speed $v_{\text{ex}}$ for which the 
shuttle would just begin to lift as soon as burn is fully underway? [Hint: The thrust must at least balance 
the shuttle’s weight.] 

3.10 ★ Consider a rocket (initial mass $m_0$) accelerating from rest in free space. At first, as it speeds up, 
its momentum p increases, but as its mass m decreases p eventually begins to decrease. For what value 
of m is p maximum? 

3.11 ★★ (a) Consider a rocket traveling in a straight line subject to an external force $F^{\text{ext}}$ acting along 
the same line. Show that the equation of motion is 

$$m\dot{v} = -\dot{m}v_{\text{ex}} + F^{\text{ext}}. \tag{3.29}$$

[Review the derivation of Equation (3.6) but keep the external force term.] (b) Specialize to the case of 
a rocket taking off vertically (from rest) in a gravitational field g, so the equation of motion becomes 

$$m\dot{v} = -\dot{m}v_{\text{ex}} - mg. \tag{3.30}$$

Assume that the rocket ejects mass at a constant rate, $\dot{m} = -k$ (where k is a positive constant), so 
that $m = m_0 - kt$. Solve equation (3.30) for v as a function of t, using separation of variables (that 
is, rewriting the equation so that all terms involving v are on the left and all terms involving t on the 
right). (c) Using the rough data from Problem 3.7, find the space shuttle’s speed two minutes into flight, 
assuming (what is nearly true) that it travels vertically up during this period and that g doesn’t change 
appreciably. Compare with the corresponding result if there were no gravity. (d) Describe what would 
happen to a rocket that was designed so that the first term on the right of Equation (3.30) was smaller 
than the initial value of the second. 

3.12 ★★ To illustrate the use of a multistage rocket consider the following: (a) A certain rocket carries 
60% of its initial mass as fuel. (That is, the mass of fuel is $0.6m_0$.) What is the rocket’s final speed, 
accelerating from rest in free space, if it burns all its fuel in a single stage? Express your answer as a 
multiple of $v_{\text{ex}}$. (b) Suppose instead it burns the fuel in two stages as follows: In the first stage it burns 
a mass $0.3m_0$ of fuel. It then jettisons the first-stage fuel tank, which has a mass of $0.1m_0$, and then 
burns the remaining $0.3m_0$ of fuel. Find the final speed in this case, assuming the same value of $v_{\text{ex}}$ 
throughout, and compare. 

3.13 ★★ If you have not already done it, do Problem 3.11(b) and find the speed v(t) of a rocket 
accelerating vertically from rest in a gravitational field g. Now integrate v(t) and show that the rocket’s 
height as a function of t is 

$$y(t) = v_{\text{ex}}t - \tfrac{1}{2}gt^2 - \frac{m v_{\text{ex}}}{k}\ln\!\left(\frac{m_0}{m}\right).$$

Using the numbers given in Problem 3.7, estimate the space shuttle’s height after two minutes. 

3.14 ★★ Consider a rocket subject to a linear resistive force, $\mathbf{f} = -b\mathbf{v}$, but no other external forces. Use 
Equation (3.29) in Problem 3.11 to show that if the rocket starts from rest and ejects mass at a constant 
rate $k = -\dot{m}$, then its speed is given by 

$$v = \frac{k}{b}\,v_{\text{ex}}\left[1 - \left(\frac{m}{m_0}\right)^{b/k}\right].$$

SECTION 3.3 The Center of Mass 

3.15 ★ Find the position of the center of mass of three particles lying in the xy plane at $\mathbf{r}_1 = (1, 1, 0)$, 
$\mathbf{r}_2 = (1, -1, 0)$, and $\mathbf{r}_3 = (0, 0, 0)$, if $m_1 = m_2$ and $m_3 = 10m_1$. Illustrate your answer with a sketch 
and comment. 

3.16 ★ The masses of the earth and sun are $M_e \approx 6.0 \times 10^{24}$ and $M_s \approx 2.0 \times 10^{30}$ (both in kg) and 
their center-to-center distance is $1.5 \times 10^8$ km. Find the position of their CM and comment. (The radius 
of the sun is $R_s \approx 7.0 \times 10^5$ km.) 

3.17 ★ The masses of the earth and moon are $M_e \approx 6.0 \times 10^{24}$ and $M_m \approx 7.4 \times 10^{22}$ (both in kg) and 
their center-to-center distance is $3.8 \times 10^5$ km. Find the position of their CM and comment. (The radius 
of the earth is $R_e \approx 6.4 \times 10^3$ km.) 

3.18 ★★ (a) Prove that the CM of any two particles always lies on the line joining them, as illustrated 
in Figure 3.3. [Write down the vector that points from $m_1$ to the CM and show that it has the same 
direction as the vector from $m_1$ to $m_2$.] (b) Prove that the distances from the CM to $m_1$ and $m_2$ are in 
the ratio $m_2/m_1$. Explain why if $m_1$ is much greater than $m_2$, the CM lies very close to the position of 
$m_1$. 

3.19 ★★ (a) We know that the path of a projectile thrown from the ground is a parabola (if we ignore air 
resistance). In the light of the result (3.12), what would be the subsequent path of the CM of the pieces 
if the projectile exploded in midair? (b) A shell is fired from level ground so as to hit a target 100 m 
away. Unluckily the shell explodes prematurely and breaks into two equal pieces. The two pieces land 
at the same time, and one lands 100 m beyond the target. Where does the other piece land? (c) Is the 
same result true if they land at different times (with one piece still landing 100 m beyond the target)? 

3.20 ★★ Consider a system comprising two extended bodies, which have masses $M_1$ and $M_2$ and centers 
of mass at $\mathbf{R}_1$ and $\mathbf{R}_2$. Prove that the CM of the whole system is at 

$$\mathbf{R} = \frac{M_1\mathbf{R}_1 + M_2\mathbf{R}_2}{M_1 + M_2}.$$

This beautiful result means that in finding the CM of a complicated system, you can treat its component 
parts just like point masses positioned at their separate centers of mass — even when the component 
parts are themselves extended bodies. 

3.21 ★★ A uniform thin sheet of metal is cut in the shape of a semicircle of radius R and lies in the xy 
plane with its center at the origin and diameter lying along the x axis. Find the position of the CM using 
polar coordinates. [In this case the sum (3.9) that defines the CM position becomes a two-dimensional 
integral of the form $\int \mathbf{r}\,\sigma\,dA$ where $\sigma$ denotes the surface mass density (mass/area) of the sheet and 
dA is the element of area $dA = r\,dr\,d\phi$.] 

3.22 ★★ Use spherical polar coordinates $r, \theta, \phi$ to find the CM of a uniform solid hemisphere of radius 
R, whose flat face lies in the xy plane with its center at the origin. Before you do this, you will need to 
convince yourself that the element of volume in spherical polars is $dV = r^2\,dr\,\sin\theta\,d\theta\,d\phi$. (Spherical 
polar coordinates are defined in Section 4.8. If you are not already familiar with these coordinates, you 
should probably not try this problem yet.) 

3.23 ★★★ [Computer] A grenade is thrown with initial velocity $\mathbf{v}_0$ from the origin at the top of a high 
cliff, subject to negligible air resistance. (a) Using a suitable plotting program, plot the orbit, with 
the following parameters: $\mathbf{v}_0 = (4, 4)$, g = 1, and 0 ≤ t ≤ 4 (and with x measured horizontally and y 
vertically up). Add to your plot suitable marks (dots or crosses, for example) to show the positions 
of the grenade at t = 1, 2, 3, 4. (b) At t = 4, when the grenade’s velocity is v, it explodes into two 
equal pieces, one of which moves off with velocity $\mathbf{v} + \Delta\mathbf{v}$. What is the velocity of the other piece? 
(c) Assuming that $\Delta\mathbf{v} = (1, 3)$, add to your original plot the paths of the two pieces for 4 ≤ t ≤ 9. Insert 
marks to show their positions at t = 5, 6, 7, 8, 9. Find some way to show clearly that the CM of the two 
pieces continues to follow the original parabolic path. 

SECTION 3.4 Angular Momentum for a Single Particle 

3.24 ★ If the vectors a and b form two of the sides of a triangle, prove that $\tfrac{1}{2}|\mathbf{a} \times \mathbf{b}|$ is equal to the area 
of the triangle. 

3.25 ★ A particle of mass m is moving on a frictionless horizontal table and is attached to a massless 
string, whose other end passes through a hole in the table, where I am holding it. Initially the particle 
is moving in a circle of radius $r_0$ with angular velocity $\omega_0$, but I now pull the string down through the 
hole until a length r remains between the hole and the particle. What is the particle’s angular velocity 
now? 

3.26 ★ A particle moves under the influence of a central force directed toward a fixed origin O. 
(a) Explain why the particle’s angular momentum about O is constant. (b) Give in detail the argument 
that the particle’s orbit must lie in a single plane containing O. 

3.27 ★★ Consider a planet orbiting the fixed sun. Take the plane of the planet’s orbit to be the xy plane, 
with the sun at the origin, and label the planet’s position by polar coordinates $(r, \phi)$. (a) Show that the 
planet’s angular momentum has magnitude $\ell = mr^2\omega$, where $\omega = \dot{\phi}$ is the planet’s angular velocity 
about the sun. (b) Show that the rate at which the planet “sweeps out area” (as in Kepler’s second law) 
is $dA/dt = \tfrac{1}{2}r^2\omega$, and hence that $dA/dt = \ell/2m$. Deduce Kepler’s second law. 

SECTION 3.5 Angular Momentum for Several Particles 

3.28 ★ For a system of just three particles, go through in detail the argument leading from (3.20) to 
(3.26), $\dot{\mathbf{L}} = \boldsymbol{\Gamma}^{\text{ext}}$, writing out all the summations explicitly. 

3.29 ★ A uniform spherical asteroid of radius $R_0$ is spinning with angular velocity $\omega_0$. As the aeons go 
by, it picks up more matter until its radius is R. Assuming that its density remains the same and that the 
additional matter was originally at rest relative to the asteroid (anyway on average), find the asteroid’s 
new angular velocity. (You know from elementary physics that the moment of inertia is $\tfrac{2}{5}MR^2$.) What 
is the final angular velocity if the radius doubles? 

3.30 ★★ Consider a rigid body rotating with angular velocity $\omega$ about a fixed axis. (You could think of 
a door rotating about the axis defined by its hinges.) Take the axis of rotation to be the z axis and use 
cylindrical polar coordinates $\rho_\alpha, \phi_\alpha, z_\alpha$ to specify the positions of the particles $\alpha = 1, \ldots, N$ that make 
up the body. (a) Show that the velocity of the particle $\alpha$ is $\rho_\alpha\omega$ in the $\phi$ direction. (b) Hence show that 
the z component of the angular momentum $\boldsymbol{\ell}_\alpha$ of particle $\alpha$ is $m_\alpha\rho_\alpha^2\omega$. (c) Show that the z component 
$L_z$ of the total angular momentum can be written as $L_z = I\omega$ where I is the moment of inertia (for the 
axis in question), 

$$I = \sum_{\alpha=1}^{N} m_\alpha \rho_\alpha^2. \tag{3.31}$$

3.31 ★★ Find the moment of inertia of a uniform disc of mass M and radius R rotating about its axis, 
by replacing the sum (3.31) by the appropriate integral and doing the integral in polar coordinates. 

3.32 ★★ Show that the moment of inertia of a uniform solid sphere rotating about a diameter is $\tfrac{2}{5}MR^2$. 
The sum (3.31) must be replaced by an integral, which is easiest in spherical polar coordinates, with 
the axis of rotation taken to be the z axis. The element of volume is $dV = r^2\,dr\,\sin\theta\,d\theta\,d\phi$. (Spherical 
polar coordinates are defined in Section 4.8. If you are not already familiar with these coordinates, you 
should probably not try this problem yet.) 

3.33 ★★ Starting from the sum (3.31) and replacing it by the appropriate integral, find the moment 
of inertia of a uniform thin square of side 2b, rotating about an axis perpendicular to the square and 
passing through its center. 

3.34 ★★ A juggler is juggling a uniform rod one end of which is coated in tar and burning. He is holding 
the rod by the opposite end and throws it up so that, at the moment of release, it is horizontal, its CM 
is traveling vertically up at speed $v_0$ and it is rotating with angular velocity $\omega_0$. To catch it, he wants 
to arrange that when it returns to his hand it will have made an integer number of complete rotations. 
What should $v_0$ be, if the rod is to have made exactly n rotations when it returns to his hand? 

3.35 ★★ Consider a uniform solid disk of mass M and radius R, rolling without slipping down an 
incline which is at angle $\gamma$ to the horizontal. The instantaneous point of contact between the disk and 
the incline is called P. (a) Draw a free-body diagram, showing all forces on the disk. (b) Find the linear 
acceleration $\dot{v}$ of the disk by applying the result $\dot{\mathbf{L}} = \boldsymbol{\Gamma}^{\text{ext}}$ for rotation about P. (Remember that $L = I\omega$ 
and the moment of inertia for rotation about a point on the circumference is $\tfrac{3}{2}MR^2$. The condition that 
the disk not slip is that $v = R\omega$ and hence $\dot{v} = R\dot{\omega}$.) (c) Derive the same result by applying $\dot{\mathbf{L}} = \boldsymbol{\Gamma}^{\text{ext}}$ 
to the rotation about the CM. (In this case you will find there is an extra unknown, the force of friction. 
You can eliminate this by applying Newton’s second law to the motion of the CM. The moment of 
inertia for rotation about the CM is $\tfrac{1}{2}MR^2$.) 

3.36 ★★ Repeat the calculations of Example 3.4 (page 97) for the case that the force F acts in a 
“northeasterly” direction at angle $\gamma$ from the x axis. What are the velocities of the two masses just 
after the impulse has been applied? Check your answers for the cases that $\gamma = 0$ and $\gamma = 90°$. 

3.37 A system consists of N masses $m_\alpha$ at positions $\mathbf{r}_\alpha$ relative to a fixed origin O. Let $\mathbf{r}'_\alpha$ denote 
the position of $m_\alpha$ relative to the CM; that is, $\mathbf{r}'_\alpha = \mathbf{r}_\alpha - \mathbf{R}$. (a) Make a sketch to illustrate this last 
equation. (b) Prove the useful relation that $\sum m_\alpha \mathbf{r}'_\alpha = 0$. Can you explain why this relation is nearly 
obvious? (c) Use this relation to prove the result (3.28) that the rate of change of the angular momentum 
about the CM is equal to the total external torque about the CM. (This result is surprising since the CM 
may be accelerating, so that it is not necessarily a fixed point in any inertial frame.) 



CHAPTER 4 


Energy 


This chapter takes up the conservation of energy. You will see that the analysis of 
energy conservation is surprisingly more complicated than the corresponding discussions 
of linear and angular momenta in Chapter 3. The main reason for the difference 
is this: In almost all problems of classical mechanics there is only one kind of linear 
momentum ($\mathbf{p} = m\mathbf{v}$ for each particle), and one kind of angular momentum ($\boldsymbol{\ell} = \mathbf{r} \times \mathbf{p}$ 
for each particle). By contrast, energy comes in many different and important forms: 
kinetic, several kinds of potential, thermal, and more. It is the processes that transform 
energy from one kind to another that complicate the use of energy conservation. 
We shall see that conservation of energy is a quite subtle business, even for a system 
consisting of just a single particle. 

One manifestation of the relative difficulty of the discussion of energy is that we 
shall need some new tools from vector calculus, namely, the concepts of the gradient 
and the curl. I shall introduce these important ideas as we need them. 


4.1 Kinetic Energy and Work 


As I have said, there are many different kinds of energy. Perhaps the most basic is 
kinetic energy (or KE), which for a single particle of mass m traveling with speed v 
is defined to be 


$$ T = \tfrac{1}{2} m v^2. \tag{4.1} $$

Let us imagine the particle moving through space and examine the change in its kinetic
energy as it moves between two neighboring points $\mathbf{r}_1$ and $\mathbf{r}_1 + d\mathbf{r}$ on its path as shown
in Figure 4.1. The time derivative of $T$ is easily evaluated if we note that $v^2 = \mathbf{v} \cdot \mathbf{v}$,
so that

$$ \frac{dT}{dt} = \tfrac{1}{2} m \frac{d}{dt}(\mathbf{v} \cdot \mathbf{v}) = \tfrac{1}{2} m (\dot{\mathbf{v}} \cdot \mathbf{v} + \mathbf{v} \cdot \dot{\mathbf{v}}) = m \dot{\mathbf{v}} \cdot \mathbf{v}. \tag{4.2} $$

Figure 4.1 Three points on the path of a particle:
$\mathbf{r}_1$, $\mathbf{r}_1 + d\mathbf{r}$ (with $d\mathbf{r}$ infinitesimal), and $\mathbf{r}_2$.


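By the second law, the factor $m\dot{\mathbf{v}}$ is equal to the net force $\mathbf{F}$ on the particle, so that

$$ \frac{dT}{dt} = \mathbf{F} \cdot \mathbf{v}. \tag{4.3} $$

If we multiply both sides by $dt$, then since $\mathbf{v}\,dt$ is the displacement $d\mathbf{r}$ we find

$$ dT = \mathbf{F} \cdot d\mathbf{r}. \tag{4.4} $$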

The expression on the right, F • dr, is defined to be the work done by the force F in 
the displacement dr. Thus we have proved the Work-KE theorem, that the change 
in the particle’s kinetic energy between two neighboring points on its path is equal to 
the work done by the net force as it moves between the two points. 1 

So far we have proved the Work-KE theorem only for an infinitesimal displacement
$d\mathbf{r}$, but it generalizes easily to larger displacements. Consider the two points
shown as $\mathbf{r}_1$ and $\mathbf{r}_2$ in Figure 4.1. We can divide the path between these points 1 and
2 into a large number of very small segments, to each of which we can apply the
infinitesimal result (4.4). Adding all of these results, we find that the total change in $T$
going from 1 to 2 is the sum $\sum \mathbf{F} \cdot d\mathbf{r}$ of all the infinitesimal works done in all the
infinitesimal displacements between points 1 and 2:

$$ \Delta T = T_2 - T_1 = \sum \mathbf{F} \cdot d\mathbf{r}. \tag{4.5} $$

In the limit that all the displacements $d\mathbf{r}$ go to zero, this sum becomes an integral:

$$ \sum \mathbf{F} \cdot d\mathbf{r} \to \int_1^2 \mathbf{F} \cdot d\mathbf{r}. \tag{4.6} $$


1 Two points that can be puzzling at first: The work F • dr can be negative, if for example 
F and dr point in opposite directions. While the notion of a force doing negative work conflicts 
with our everyday notion of work, it is perfectly consistent with the physicist’s definition: A force
in the opposite direction to the displacement reduces the KE, so, by the work-KE theorem, the
corresponding work has to be negative. Second, if F and dr are perpendicular, then the work F • dr
is zero. Again this conflicts with our everyday sense of work, but is consistent with the physicist’s
usage: A force that is perpendicular to the displacement does not change the KE.

This integral, called a line integral, 2 is a generalization of the integral $\int f(x)\,dx$ over
a single variable $x$, and its definition as the limit of the sum of many small pieces
is closely analogous. If you feel any doubt about the symbol $\int_1^2 \mathbf{F} \cdot d\mathbf{r}$ on the right
of (4.6), think of it as being just the sum on the left (with all the displacements
infinitesimally small). In evaluating a line integral, it is usually possible to convert
it into an ordinary integral over a single variable, as the following examples show.
Notice that, as the name implies, the line integral depends (in general) on the path
that the particle followed from point 1 to point 2. The particular line integral on the
right of (4.6) is called the work done by the force $\mathbf{F}$ moving between points 1 and 2
along the path concerned.


EXAMPLE 4.1 Three Line Integrals

Evaluate the line integral for the work done by the two-dimensional force 
F = (y, 2x) going from the origin O to the point P = (1,1) along each of the 
three paths shown in Figure 4.2. Path a goes from O to Q = (1, 0) along the x 
axis and then from Q straight up to P, path b goes straight from O to P along 
the line y = x, and path c goes round a quarter circle centered on Q. 

The integral along path a is easily evaluated in two parts, if we note that
on OQ the displacements have the form $d\mathbf{r} = (dx, 0)$, while on QP they are
$d\mathbf{r} = (0, dy)$. Thus

$$ W_a = \int_a \mathbf{F} \cdot d\mathbf{r} = \int_O^Q \mathbf{F} \cdot d\mathbf{r} + \int_Q^P \mathbf{F} \cdot d\mathbf{r} = \int_0^1 F_x(x, 0)\,dx + \int_0^1 F_y(1, y)\,dy = 0 + 2\int_0^1 dy = 2. $$



Figure 4.2 Three different paths, a, b, and c, from the 
origin to the point P = (1, 1). 


2 Not an especially happy name for those of us who think of a line as something straight. However, 
there are curved lines as well as straight lines, and in general a line integral can involve a curved 
line, such as the path shown in Figure 4.1.

On the path b, $x = y$, so that $dx = dy$, and

$$ W_b = \int_b \mathbf{F} \cdot d\mathbf{r} = \int_b (F_x\,dx + F_y\,dy) = \int_0^1 (x + 2x)\,dx = 1.5. $$


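Path c is conveniently expressed parametrically as

$$ \mathbf{r} = (x, y) = (1 - \cos\theta, \sin\theta) $$

where $\theta$ is the angle between OQ and the line from Q to the point $(x, y)$, with
$0 \le \theta \le \pi/2$. Thus on path c

$$ d\mathbf{r} = (dx, dy) = (\sin\theta, \cos\theta)\,d\theta $$

and

$$ W_c = \int_c \mathbf{F} \cdot d\mathbf{r} = \int_c (F_x\,dx + F_y\,dy) = \int_0^{\pi/2} \left[\sin^2\theta + 2(1 - \cos\theta)\cos\theta\right] d\theta = 2 - \pi/4 = 1.21. $$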


Some more examples can be found in Problems 4.2 and 4.3 and, if you have never 
studied line integrals, you may want to try some of these. 
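
The three results of Example 4.1 can also be checked numerically. The following is a minimal sketch, not part of the original example: it approximates each line integral by a sum of many small contributions F • dr, exactly as in the definition (4.6). The function names (`force`, `line_integral`, `path_a`, and so on), the parametrizations, and the number of segments are illustrative assumptions.

```python
import math


def force(x, y):
    """The two-dimensional force F = (y, 2x) of Example 4.1."""
    return (y, 2.0 * x)


def line_integral(path, t0, t1, n=20000):
    """Approximate the work integral of F . dr along the curve r(t), t0 <= t <= t1.

    The curve is cut into n short chords. On each chord, F is evaluated at the
    midpoint parameter value and dotted with the chord displacement dr.
    """
    total = 0.0
    dt = (t1 - t0) / n
    for k in range(n):
        ta = t0 + k * dt
        tb = ta + dt
        xa, ya = path(ta)
        xb, yb = path(tb)
        xm, ym = path(0.5 * (ta + tb))
        fx, fy = force(xm, ym)
        total += fx * (xb - xa) + fy * (yb - ya)
    return total


def path_a(t):
    """O to Q along the x axis for 0 <= t <= 1, then Q straight up to P for 1 <= t <= 2."""
    return (t, 0.0) if t <= 1.0 else (1.0, t - 1.0)


def path_b(t):
    """Straight from O to P along the line y = x, 0 <= t <= 1."""
    return (t, t)


def path_c(theta):
    """Quarter circle centered on Q, 0 <= theta <= pi/2."""
    return (1.0 - math.cos(theta), math.sin(theta))


W_a = line_integral(path_a, 0.0, 2.0)
W_b = line_integral(path_b, 0.0, 1.0)
W_c = line_integral(path_c, 0.0, math.pi / 2)
print(W_a, W_b, W_c)
```

The three printed values should come out close to 2, 1.5, and 2 - pi/4 (about 1.21), showing directly that the work done by this particular force depends on the path.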

With the notation of the line integral, we can rewrite the result (4.5) as

$$ \Delta T \equiv T_2 - T_1 = \int_1^2 \mathbf{F} \cdot d\mathbf{r} \equiv W(1 \to 2) \tag{4.7} $$

where I have introduced the notation $W(1 \to 2)$ for the work done by $\mathbf{F}$ moving from
point 1 to point 2. The result is the Work-KE theorem for arbitrary displacements,
large or small: The change in a particle’s KE as it moves between points 1 and 2 is
the work done by the net force.

It is important to remember that the work that appears on the right of (4.7) is the
work done by the net force $\mathbf{F}$ on the particle. In general, $\mathbf{F}$ is the vector sum of various
separate forces

$$ \mathbf{F} = \mathbf{F}_1 + \cdots + \mathbf{F}_n \equiv \sum_{i=1}^{n} \mathbf{F}_i. $$

(For example, the net force on a projectile is the sum of two forces, the weight and
air resistance.) It is a most convenient fact that to evaluate the work done by the net
force $\mathbf{F}$, we can simply add up the works done by the separate forces $\mathbf{F}_1, \cdots, \mathbf{F}_n$. This
claim is easily proved as follows:

$$ W(1 \to 2) = \int_1^2 \mathbf{F} \cdot d\mathbf{r} = \int_1^2 \sum_i \mathbf{F}_i \cdot d\mathbf{r} $$

$$ = \sum_i \int_1^2 \mathbf{F}_i \cdot d\mathbf{r} = \sum_i W_i(1 \to 2). \tag{4.8} $$

The crucial step, from the first line to the second, is justified because the integral of
a sum of $n$ terms is the same as the sum of the $n$ individual integrals. The Work-KE
theorem can therefore be rewritten as

$$ T_2 - T_1 = \sum_{i=1}^{n} W_i(1 \to 2). \tag{4.9} $$

In practice, one almost always uses the theorem in this way: Calculate the work $W_i$
done by each of the $n$ separate forces on the particle and then set $\Delta T$ equal to the sum
of all the $W_i$.
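
As an illustration of using the theorem in this way, here is a rough numerical sketch for the projectile mentioned above, subject to its weight and air resistance. It is not taken from the text: the linear drag law f = -b v, the parameter values, and all function names are assumptions made purely for illustration. The works done by the two separate forces are accumulated along the trajectory by integrating dW = F • v dt, and their sum is compared with the change in kinetic energy.

```python
def derivatives(state, m, g, b):
    """Time derivatives of (x, y, vx, vy, W_grav, W_drag) for a projectile."""
    x, y, vx, vy, w_grav, w_drag = state
    fgx, fgy = 0.0, -m * g        # weight
    fdx, fdy = -b * vx, -b * vy   # linear drag: an assumed model for air resistance
    ax = (fgx + fdx) / m          # Newton's second law with the net force
    ay = (fgy + fdy) / m
    # Rate at which each separate force does work: dW/dt = F . v
    return (vx, vy, ax, ay, fgx * vx + fgy * vy, fdx * vx + fdy * vy)


def rk4_step(state, dt, m, g, b):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, factor):
        return tuple(s + factor * dt * ki for s, ki in zip(state, k))

    k1 = derivatives(state, m, g, b)
    k2 = derivatives(shifted(k1, 0.5), m, g, b)
    k3 = derivatives(shifted(k2, 0.5), m, g, b)
    k4 = derivatives(shifted(k3, 1.0), m, g, b)
    return tuple(
        s + dt * (a + 2.0 * p + 2.0 * q + d) / 6.0
        for s, a, p, q, d in zip(state, k1, k2, k3, k4)
    )


def work_ke_check(m=0.5, g=9.8, b=0.1, v0=(10.0, 10.0), t_end=1.0, steps=1000):
    """Return (Delta T, W_grav, W_drag) for the flight from t = 0 to t = t_end."""
    state = (0.0, 0.0, v0[0], v0[1], 0.0, 0.0)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt, m, g, b)
    _, _, vx, vy, w_grav, w_drag = state
    ke_initial = 0.5 * m * (v0[0] ** 2 + v0[1] ** 2)
    ke_final = 0.5 * m * (vx ** 2 + vy ** 2)
    return ke_final - ke_initial, w_grav, w_drag


delta_ke, w_grav, w_drag = work_ke_check()
print(delta_ke, w_grav + w_drag)
```

The two printed numbers should agree to within the accuracy of the integration, and the drag contribution should come out negative, since that force always opposes the motion.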

If the net force on a particle is zero, then the Work-KE theorem tells us that the 
particle’s kinetic energy is constant. This simply says that the speed v is constant, 
which, though true, is not very interesting, since it already follows from Newton’s 
first law. 


4.2 Potential Energy and Conservative Forces 


The next step in the development of the energy formalism is to introduce the notion 
of potential energy (or PE) corresponding to the forces on an object. As you probably 
recall, not every force lends itself to the definition of a corresponding potential energy. 
Those special forces that do have a corresponding potential energy (with the required 
properties) are called conservative forces, and we must discuss the properties that 
distinguish conservative from nonconservative forces. Specifically, we shall find that 
there are two conditions that a force must satisfy to be considered conservative. 

To simplify our discussion, let us assume at first that there is only one force acting 
on the object of interest — the gravitational force on a planet by its sun, or the electric 
force qE on a charge in an electric field (with no other forces present). The force F may 
depend on many different variables: It may depend on the object’s position r. (The 
farther the planet is from the sun, the weaker the gravitational pull.) It may depend 
on the object’s velocity, as is the case with air resistance; and it may depend on the 
time t, as would be the case for a charge in a time-varying electric field. Finally, if the 
force is exerted by humans, it will depend on a host of imponderables — how tired 
they are feeling, how conveniently they are situated to push, and so on. 

The first condition for a force $\mathbf{F}$ to be conservative is that $\mathbf{F}$ depends only on the
position $\mathbf{r}$ of the object on which it acts; it must not depend on the velocity, the time, or
any variables other than $\mathbf{r}$. This sounds, and is, quite restrictive, but there are plenty of
forces that have this property: The gravitational force of the sun on a planet (position
$\mathbf{r}$ relative to the sun) can be written as

$$ \mathbf{F}(\mathbf{r}) = -\frac{GmM}{r^2}\,\hat{\mathbf{r}}, $$

which evidently depends only on the variable $\mathbf{r}$. (The parameters $G$, $m$, $M$ are constant
for a given planet and given sun.) Similarly, the electrostatic force $\mathbf{F}(\mathbf{r}) = q\mathbf{E}(\mathbf{r})$ on a
charge $q$ by a static electric field $\mathbf{E}(\mathbf{r})$ has this property. Forces that do not satisfy this
condition include the force of air resistance (which depends on the velocity), friction
(which depends on the direction of motion), the magnetic force (which depends on
the velocity), and the force of a time-varying electric field $\mathbf{E}(\mathbf{r}, t)$ (which obviously
depends on time).

Figure 4.3 Three different paths, a, b, and c, joining the
same two points 1 and 2.

The second condition that a force must satisfy to be called conservative concerns
the work done by the force as the object on which it acts moves between two points
$\mathbf{r}_1$ and $\mathbf{r}_2$ (or just 1 and 2 for short),

$$ W(1 \to 2) = \int_1^2 \mathbf{F} \cdot d\mathbf{r}. \tag{4.10} $$

Figure 4.3 shows two points, 1 and 2, and three different paths connecting them. It is
entirely possible that the work done between points 1 and 2, as defined by the integral
(4.10), has different values depending on which of the three paths, a, b, or c, the
particle happens to follow. For example, consider the force of sliding friction as I
push a heavy crate across the floor. This force has a constant magnitude, $F_{\text{fric}}$ say, and
is always opposite to the direction of motion. Thus the work done by friction as the
crate moves from 1 to 2 is given by (4.10) to be

$$ W_{\text{fric}}(1 \to 2) = -F_{\text{fric}} L, $$

where $L$ denotes the length of the path followed. The three paths of Figure 4.3 have
different lengths, and $W_{\text{fric}}(1 \to 2)$ will have a different value for each of the three
paths.

On the other hand, there are forces with the property that the work $W(1 \to 2)$ is
the same for any path connecting the same two points 1 and 2. An example of a force
with this property is the gravitational force, $\mathbf{F}_{\text{grav}} = m\mathbf{g}$, of the earth on an object close
to the earth’s surface. It is easy to show (Problem 4.5) that, because $\mathbf{g}$ is a constant
vector pointing vertically down, the work done in this case is

$$ W_{\text{grav}}(1 \to 2) = -mgh, \tag{4.11} $$

where $h$ is just the vertical height gained between points 1 and 2. This work is the
same for any two paths between the given points 1 and 2. This property, the path
independence of the work it does, is the second condition that a force must satisfy to
be considered conservative, and we are now ready to state the two conditions:


Conditions for a Force to be Conservative 
A force F acting on a particle is conservative if and only if it satisfies two 
conditions: 

(i) F depends only on the particle’s position r (and not on the velocity v, or 
the time t, or any other variable); that is, F = F(r). 

(ii) For any two points 1 and 2, the work $W(1 \to 2)$ done by $\mathbf{F}$ is the same
for all paths between 1 and 2.
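
Condition (ii) can be made concrete with a short numerical sketch. The code below is an illustration only: the three piecewise-straight paths, the numerical values, and the function names are all assumptions, not taken from the text. For the gravity calculation the coordinates are read as (x, y) with y vertical; for the friction calculation only the length of the path matters. The code compares the work done by the constant weight mg, as in (4.11), with the work done by a friction force of constant magnitude that always opposes the motion.

```python
import math


def path_length(points):
    """Total length of a path made of straight segments joining the given points."""
    return sum(math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(points, points[1:]))


def work_gravity(points, m, g):
    """Work done by the constant force (0, -m g) along the path, y being vertical.

    For a constant force, F . dr summed over straight segments is exact.
    """
    return sum(-m * g * (q[1] - p[1]) for p, q in zip(points, points[1:]))


def work_friction(points, f_fric):
    """Work done by friction of constant magnitude f_fric opposing the motion: -f_fric * L."""
    return -f_fric * path_length(points)


# Three hypothetical paths joining the same end points (0, 0) and (3, 4).
path_a = [(0.0, 0.0), (3.0, 4.0)]
path_b = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
path_c = [(0.0, 0.0), (0.0, 6.0), (3.0, 4.0)]

for name, path in (("a", path_a), ("b", path_b), ("c", path_c)):
    print(name, work_gravity(path, m=2.0, g=9.8), work_friction(path, f_fric=5.0))
```

The gravitational work should be the same for all three paths (it depends only on the height gained), while the frictional work differs because the paths have different lengths.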


The reason for the name “conservative” and for the importance of the concept is 
this: If all forces on an object are conservative, we can define a quantity called the 
potential energy (or just PE), denoted U(r), a function only of position, with the 
property that the total mechanical energy 

$$ E = \text{KE} + \text{PE} = T + U(\mathbf{r}) \tag{4.12} $$

is constant; that is, E is conserved. 

To define the potential energy $U(\mathbf{r})$ corresponding to a given conservative force,
we first choose a reference point $\mathbf{r}_0$ at which $U$ is defined to be zero. (For example,
in the case of gravity near the earth’s surface, we often define $U$ to be zero at ground
level.) We then define $U(\mathbf{r})$, the potential energy at an arbitrary point $\mathbf{r}$, to be 3

$$ U(\mathbf{r}) = -W(\mathbf{r}_0 \to \mathbf{r}) \equiv -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{F}(\mathbf{r}') \cdot d\mathbf{r}'. \tag{4.13} $$

In words, $U(\mathbf{r})$ is minus the work done by $\mathbf{F}$ if the particle moves from the reference
point $\mathbf{r}_0$ to the point of interest $\mathbf{r}$, as in Figure 4.4. (We shall see the reason for the
minus sign shortly.) Notice that the definition (4.13) only makes sense because of the
property (ii) of conservative forces. If the work integral in (4.13) were different for
different paths, then (4.13) would not define a unique function 4 $U(\mathbf{r})$.


3 Notice that I have called the variable of integration r' to avoid confusion with the upper limit r. 

4 The definition (4.13) also depends on property (i) of conservative forces, but in a slightly subtler
way. If $\mathbf{F}$ depended on another variable besides $\mathbf{r}$ (for instance, $t$ or $\mathbf{v}$), then the right side of (4.13)
would depend on when or how the particle moved from $\mathbf{r}_0$ to $\mathbf{r}$, and again there would be no uniquely
defined $U(\mathbf{r})$.

Figure 4.4 The potential energy $U(\mathbf{r})$ at any point $\mathbf{r}$ is defined
as minus the work done by $\mathbf{F}$ if the particle moves from
the reference point $\mathbf{r}_0$ to $\mathbf{r}$. This gives a well-defined function
$U(\mathbf{r})$ only if this work is independent of the path followed,
that is, only if the force is conservative.


EXAMPLE 4.2 Potential Energy of a Charge in a Uniform Electric Field

A charge $q$ is placed in a uniform electric field pointing in the $x$ direction with
strength $E_0$, so that the force on $q$ is $\mathbf{F} = q\mathbf{E} = qE_0\hat{\mathbf{x}}$. Show that this force is
conservative and find the corresponding potential energy.

The work done by $\mathbf{F}$ going between any two points 1 and 2 along any path is

$$ W(1 \to 2) = \int_1^2 \mathbf{F} \cdot d\mathbf{r} = qE_0 \int_1^2 \hat{\mathbf{x}} \cdot d\mathbf{r} = qE_0 \int_1^2 dx = qE_0 (x_2 - x_1). \tag{4.14} $$

This depends only on the two end points 1 and 2. (In fact it depends only on their
$x$ coordinates $x_1$ and $x_2$.) Certainly, it is independent of the path, and the force is
conservative. To define the corresponding potential energy $U(\mathbf{r})$, we must first
pick a reference point $\mathbf{r}_0$ at which $U$ will be zero. A natural choice is the origin,
$\mathbf{r}_0 = 0$, in which case the potential energy is $U(\mathbf{r}) = -W(0 \to \mathbf{r})$ or, according
to (4.14),

$$ U(\mathbf{r}) = -qE_0 x. $$


We can now derive a crucial expression for the work done by $\mathbf{F}$ in terms of the
potential energy $U(\mathbf{r})$. Let $\mathbf{r}_1$ and $\mathbf{r}_2$ be any two points as in Figure 4.5. If $\mathbf{r}_0$ is the
reference point at which $U$ is zero, then it is clear from Figure 4.5 that

$$ W(\mathbf{r}_0 \to \mathbf{r}_2) = W(\mathbf{r}_0 \to \mathbf{r}_1) + W(\mathbf{r}_1 \to \mathbf{r}_2) $$

and hence

$$ W(\mathbf{r}_1 \to \mathbf{r}_2) = W(\mathbf{r}_0 \to \mathbf{r}_2) - W(\mathbf{r}_0 \to \mathbf{r}_1). \tag{4.15} $$

Each of the two terms on the right is (minus) the potential energy at the corresponding
point. Thus we have proved that the work on the left is just the difference of these two
potential energies:

$$ W(\mathbf{r}_1 \to \mathbf{r}_2) = -[U(\mathbf{r}_2) - U(\mathbf{r}_1)] = -\Delta U. \tag{4.16} $$

Figure 4.5 The work $W(\mathbf{r}_1 \to \mathbf{r}_2)$ going from $\mathbf{r}_1$ to $\mathbf{r}_2$ is the
same as $W(\mathbf{r}_0 \to \mathbf{r}_2)$ minus $W(\mathbf{r}_0 \to \mathbf{r}_1)$. This result is independent
of what path we use for either limb of the journey,
provided the force concerned is conservative.


The usefulness of this result emerges when we combine it with the Work-KE
theorem (4.7):

$$ \Delta T = W(\mathbf{r}_1 \to \mathbf{r}_2). \tag{4.17} $$

Comparing this with (4.16), we see that

$$ \Delta T = -\Delta U \tag{4.18} $$

or, moving the right side across to the left, 5

$$ \Delta(T + U) = 0. \tag{4.19} $$

That is, the mechanical energy

$$ E = T + U \tag{4.20} $$

does not change as the particle moves from $\mathbf{r}_1$ to $\mathbf{r}_2$. Since the points $\mathbf{r}_1$ and $\mathbf{r}_2$ were any
two points on the particle’s trajectory, we have the important conclusion: If the force
on a particle is conservative, then the particle’s mechanical energy never changes;
that is, the particle’s energy is conserved, which explains the use of the adjective
“conservative.”
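
This conclusion is easy to check for the charge in a uniform electric field of Example 4.2, where the force is constant and the trajectory is known in closed form. The sketch below is illustrative only: the function name, the parameter values, and the initial conditions are assumptions. It evaluates E = T + U, with U = -q E0 x, at a series of times along the constant-acceleration trajectory.

```python
def mechanical_energy(t, m=2.0, q=1.5, E0=4.0, r0=(0.0, 0.0), v0=(1.0, 3.0)):
    """Return E = T + U at time t for a charge q in a uniform field E0 along x.

    The force q*E0 is constant, so the motion has constant acceleration in x
    and constant velocity in y. The reference point for U is the origin.
    """
    a = q * E0 / m                           # acceleration along x
    x = r0[0] + v0[0] * t + 0.5 * a * t * t  # position along x
    vx = v0[0] + a * t
    vy = v0[1]
    T = 0.5 * m * (vx * vx + vy * vy)        # kinetic energy
    U = -q * E0 * x                          # potential energy, as in Example 4.2
    return T + U


energies = [mechanical_energy(0.1 * k) for k in range(11)]
spread = max(energies) - min(energies)
print(energies[0], spread)
```

With these particular values the particle starts at the reference point, so E equals the initial kinetic energy, and the spread of E over the sampled times should be zero to within rounding error.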


Several Forces 

So far we have established the conservation of energy for a particle subject to a single
conservative force. If the particle is subject to several forces, all of them conservative,
our result generalizes easily. For instance, imagine a mass suspended from the ceiling
by a spring. This mass is subject to two forces, the forces of gravity ($\mathbf{F}_{\text{grav}}$) and the
spring ($\mathbf{F}_{\text{spr}}$). The force of gravity is certainly conservative (as I’ve already argued),
and, provided the spring obeys Hooke’s law, $\mathbf{F}_{\text{spr}}$ is likewise (see Problem 4.42). We
can define separate potential energies for each force, $U_{\text{grav}}$ for $\mathbf{F}_{\text{grav}}$ and $U_{\text{spr}}$ for $\mathbf{F}_{\text{spr}}$,
each with the crucial property (4.16) that the change in $U$ gives (minus) the work done
by the corresponding force. According to the Work-KE theorem, the change in the
mass’s kinetic energy is

$$ \Delta T = W_{\text{grav}} + W_{\text{spr}} $$

$$ = -(\Delta U_{\text{grav}} + \Delta U_{\text{spr}}), \tag{4.21} $$

where the second line follows from the properties of the two separate potential
energies. Rearranging this equation, we see that $\Delta(T + U_{\text{grav}} + U_{\text{spr}}) = 0$. That is,
the total mechanical energy, defined as $E = T + U_{\text{grav}} + U_{\text{spr}}$, is conserved.

5 We now see the reason for the minus sign in the definition of $U$. It gives the minus sign on the
right of (4.18), which in turn gives the desired plus sign on the left of (4.19).

The argument just given extends immediately to the case of $n$ forces on a particle,
so long as they are all conservative. If for each force $\mathbf{F}_i$ we define a corresponding
potential energy $U_i$, then we have the


Principle of Conservation of Energy for One Particle
If all of the $n$ forces $\mathbf{F}_i$ ($i = 1, \cdots, n$) acting on a particle are conservative, each
with its corresponding potential energy $U_i(\mathbf{r})$, the total mechanical energy,
defined as

$$ E \equiv T + U \equiv T + U_1(\mathbf{r}) + \cdots + U_n(\mathbf{r}), \tag{4.22} $$

is constant in time.


Nonconservative Forces 

If some of the forces on our particle are nonconservative, then we cannot define
corresponding potential energies; nor can we define a conserved mechanical energy.
Nevertheless, we can define potential energies for all of the forces that are conservative,
and then recast the Work-KE theorem in a form that shows how the nonconservative
forces change the particle’s mechanical energy. First, we divide the net force on the
particle into two parts, the conservative part $\mathbf{F}_{\text{cons}}$ and the nonconservative part $\mathbf{F}_{\text{nc}}$.
For $\mathbf{F}_{\text{cons}}$ we can define a potential energy, which we’ll call just $U$. By the Work-KE
theorem, the change in kinetic energy between any two times is

$$ \Delta T = W = W_{\text{cons}} + W_{\text{nc}}. \tag{4.23} $$

The first term on the right is just $-\Delta U$ and can be moved to the left side to give
$\Delta(T + U) = W_{\text{nc}}$. If we define the mechanical energy as $E = T + U$, then we see
that

$$ \Delta E = \Delta(T + U) = W_{\text{nc}}. \tag{4.24} $$

Mechanical energy is no longer conserved, but we have the next best thing. The
mechanical energy changes to precisely the extent that the nonconservative forces
do work on our particle. In many problems the only nonconservative force is the force
of sliding friction, which usually does negative work. (The frictional force $\mathbf{f}$ is in
the direction opposite to the motion, so the work $\mathbf{f} \cdot d\mathbf{r}$ is negative.) In this case $W_{\text{nc}}$
is negative and (4.24) tells us that the object loses mechanical energy in the amount
“stolen” by friction. All of these ideas are illustrated by the following simple example.


EXAMPLE 4.3 Block Sliding Down an Incline

Consider again the block of Example 1.1 and find its speed $v$ when it reaches
the bottom of the slope, a distance $d$ from its starting point.

The setup and the forces on the block are shown in Figure 4.6. The three
forces on the block are its weight, $\mathbf{w} = m\mathbf{g}$, the normal force of the incline,
$\mathbf{N}$, and the frictional force $\mathbf{f}$, whose magnitude we found in Example 1.1 to be
$f = \mu mg\cos\theta$. The weight $m\mathbf{g}$ is conservative, and the corresponding potential
energy is (as you certainly recall from introductory physics, but see Problem 4.5)

$$ U = mgy $$

where $y$ is the block’s vertical height above the bottom of the slope (if we
choose the zero of PE at the bottom). The normal force does no work, since it is
perpendicular to the direction of motion, so will not contribute to the energy
balance. The frictional force does work $W_{\text{fric}} = -fd = -\mu mgd\cos\theta$. The
change in kinetic energy is $\Delta T = T_f - T_i = \tfrac{1}{2}mv^2$ and the change in potential
energy is $\Delta U = U_f - U_i = -mgh = -mgd\sin\theta$. Thus (4.24) reads

$$ \Delta T + \Delta U = W_{\text{fric}} $$

or

$$ \tfrac{1}{2}mv^2 - mgd\sin\theta = -\mu mgd\cos\theta. $$

Solving for $v$ we find

$$ v = \sqrt{2gd(\sin\theta - \mu\cos\theta)}. $$

Figure 4.6 A block on an incline of angle $\theta$. The length
of the slope is $d$, and the height is $h = d\sin\theta$.

As usual, you should check that this answer agrees with common sense. For
example, does it give the expected answer when $\theta = 90°$? What about $\theta = 0$?
(The case $\theta = 0$ is a bit subtler.)
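
The limiting cases suggested above can be explored with a few lines of code. This is a minimal sketch under stated assumptions: the function name and default value of g are illustrative, the block is taken to start from rest (as the step Delta T = (1/2) m v^2 implies), and the choice to return `None` when the quantity under the square root is negative is a convention of this sketch, flagging that the formula does not apply there.

```python
import math


def final_speed(d, theta, mu, g=9.8):
    """Speed at the bottom of the incline from Delta T + Delta U = W_fric.

    d is the distance along the slope, theta the incline angle in radians,
    mu the coefficient of kinetic friction. Returns None when
    sin(theta) - mu*cos(theta) is negative: a block released from rest would
    then not slide, and the formula from the example does not apply.
    """
    s = math.sin(theta) - mu * math.cos(theta)
    if s < 0.0:
        return None
    return math.sqrt(2.0 * g * d * s)


print(final_speed(1.0, math.radians(90.0), 0.3))  # vertical drop: friction plays no role
print(final_speed(1.0, math.radians(30.0), 0.0))  # frictionless incline
print(final_speed(1.0, 0.0, 0.3))                 # level surface with friction
```

For a vertical incline the result should reduce to the free-fall value, the square root of 2gd, whatever the value of mu, and for mu = 0 it should reduce to the square root of 2gh with h = d sin(theta). On a level surface with friction the expression under the square root is negative, which is the sense in which the case of zero angle needs more care.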


4.3 Force as the Gradient of Potential Energy 


We have seen that the potential energy U( r) corresponding to a force F(r) can be 
expressed as an integral of F(r) as in (4.13). This suggests that we should be able to 
write F(r) as some kind of derivative of U (r). This suggestion proves correct, though 
to implement it we shall need some mathematics that you may not have met before. 
Specifically, since F(r) is a vector [while U (r) is a scalar] we shall be involved in 
some vector calculus. 

Let us consider a particle acted on by a conservative force F(r), with corresponding 
potential energy U (r), and examine the work done by F(r) in a small displacement 
from r to r + dr. We can evaluate this work in two ways. On the one hand, it is, by 
definition, 

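$$ W(\mathbf{r} \to \mathbf{r} + d\mathbf{r}) = \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} $$

$$ = F_x\,dx + F_y\,dy + F_z\,dz, \tag{4.25} $$

for any small displacement $d\mathbf{r}$ with components $(dx, dy, dz)$.

On the other hand, we have seen that the work $W(\mathbf{r} \to \mathbf{r} + d\mathbf{r})$ is the same as
(minus) the change in PE in the displacement:

$$ W(\mathbf{r} \to \mathbf{r} + d\mathbf{r}) = -dU = -[U(\mathbf{r} + d\mathbf{r}) - U(\mathbf{r})] $$

$$ = -[U(x + dx, y + dy, z + dz) - U(x, y, z)]. \tag{4.26} $$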

In the second line, I have replaced the position vector $\mathbf{r}$ by its components to emphasize
that $U$ is really a function of the three variables $(x, y, z)$. Now, for functions of one
variable, a difference like that in (4.26) can be expressed in terms of the derivative:

$$ df = f(x + dx) - f(x) = \frac{df}{dx}\,dx. \tag{4.27} $$

This is really no more than the definition of the derivative. 6 For a function of three
variables, such as $U(x, y, z)$, the corresponding result is

$$ dU = U(x + dx, y + dy, z + dz) - U(x, y, z) $$

$$ = \frac{\partial U}{\partial x}\,dx + \frac{\partial U}{\partial y}\,dy + \frac{\partial U}{\partial z}\,dz \tag{4.28} $$


where the three derivatives are the partial derivatives with respect to the three
independent variables $(x, y, z)$. [For example, $\partial U/\partial x$ is the rate of change of $U$ as $x$
changes, with $y$ and $z$ fixed, and is found by differentiating $U(x, y, z)$ with respect
to $x$ treating $y$ and $z$ as constants. See Problems 4.10 and 4.11 for some examples.]
Substituting (4.28) into (4.26), we find that the work done in the small displacement
from $\mathbf{r}$ to $\mathbf{r} + d\mathbf{r}$ is

$$ W(\mathbf{r} \to \mathbf{r} + d\mathbf{r}) = -\left[\frac{\partial U}{\partial x}\,dx + \frac{\partial U}{\partial y}\,dy + \frac{\partial U}{\partial z}\,dz\right]. \tag{4.29} $$

The two expressions (4.25) and (4.29) are both valid for any small displacement $d\mathbf{r}$.
In particular, we can choose $d\mathbf{r}$ to point in the $x$ direction, in which case $dy = dz = 0$
and the last two terms in both (4.25) and (4.29) are zero. Equating the remaining terms,
we see that $F_x = -\partial U/\partial x$. By choosing $d\mathbf{r}$ to point in the $y$ or $z$ directions, we get
corresponding results for $F_y$ and $F_z$, and we conclude that

$$ F_x = -\frac{\partial U}{\partial x}, \quad F_y = -\frac{\partial U}{\partial y}, \quad F_z = -\frac{\partial U}{\partial z}. \tag{4.30} $$


6 Strictly speaking, this equation is exact only in the limit that $dx \to 0$. As usual, I take the view
that $dx$ is small enough (though nonzero) that the two sides are equal within our chosen accuracy
target.


That is, F is the vector whose three components are minus the three partial derivatives 
of U with respect to x, y, and z. A slightly more compact way to write this result is 
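this:

$$ \mathbf{F} = -\hat{\mathbf{x}}\frac{\partial U}{\partial x} - \hat{\mathbf{y}}\frac{\partial U}{\partial y} - \hat{\mathbf{z}}\frac{\partial U}{\partial z}. \tag{4.31} $$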


Relationships like (4.31) between a vector ($\mathbf{F}$) and a scalar ($U$) come up over and
over again in physics. For example, the electric field $\mathbf{E}$ is related to the electrostatic
potential $V$ in exactly the same way. More generally, given any scalar $f(\mathbf{r})$, the vector
whose three components are the partial derivatives of $f(\mathbf{r})$ is called the gradient of
$f$, denoted $\nabla f$:

$$ \nabla f = \hat{\mathbf{x}}\frac{\partial f}{\partial x} + \hat{\mathbf{y}}\frac{\partial f}{\partial y} + \hat{\mathbf{z}}\frac{\partial f}{\partial z}. \tag{4.32} $$


The symbol $\nabla f$ is pronounced “grad $f$.” The symbol $\nabla$ by itself is called “grad,” or
“del,” or “nabla.” With this notation, (4.31) is abbreviated to

$$ \mathbf{F} = -\nabla U. \tag{4.33} $$

This important relation gives us the force $\mathbf{F}$ in terms of derivatives of $U$, just as the
definition (4.13) gave $U$ as an integral of $\mathbf{F}$. When a force $\mathbf{F}$ can be expressed in the
form (4.33), we say that $\mathbf{F}$ is derivable from a potential energy. Thus, we have shown
that any conservative force is derivable from a potential energy. 7


7 I am following standard terminology here. Notice that we have defined “conservative” so that
a conservative force conserves energy and is derivable from a potential energy. This is occasionally 
confusing, since there are forces (such as the magnetic force on a charge or the normal force on a 
sliding object) that do no work and hence conserve energy, but are not “conservative” in the sense 
defined here, since they are not derivable from a potential energy. This unfortunate confusion seldom 
causes trouble, but you may want to register it somewhere in the back of your mind.

EXAMPLE 4.4 Finding F from U

The potential energy of a certain particle is $U = Axy^2 + B\sin Cz$, where $A$, $B$,
and $C$ are constants. What is the corresponding force?

To find $\mathbf{F}$ we have only to evaluate the three partial derivatives in (4.31).
In doing this, you must remember that $\partial U/\partial x$ is found by differentiating with
respect to $x$, treating $y$ and $z$ as constant, and so on. Thus $\partial U/\partial x = Ay^2$, and
so on, and the final result is

$$ \mathbf{F} = -(\hat{\mathbf{x}}\,Ay^2 + \hat{\mathbf{y}}\,2Axy + \hat{\mathbf{z}}\,BC\cos Cz). $$
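
The result of Example 4.4 can be cross-checked numerically by approximating each partial derivative with a central difference. The sketch below is not part of the original example: the numerical values chosen for A, B, and C, the step size h, the test point, and the function names are all assumptions made for illustration.

```python
import math

A, B, C = 2.0, 3.0, 0.5  # arbitrary illustrative values for the constants


def U(x, y, z):
    """Potential energy of Example 4.4: U = A x y^2 + B sin(C z)."""
    return A * x * y ** 2 + B * math.sin(C * z)


def force_from_potential(potential, x, y, z, h=1e-5):
    """Estimate F = -grad U with central differences, one variable at a time."""
    fx = -(potential(x + h, y, z) - potential(x - h, y, z)) / (2.0 * h)
    fy = -(potential(x, y + h, z) - potential(x, y - h, z)) / (2.0 * h)
    fz = -(potential(x, y, z + h) - potential(x, y, z - h)) / (2.0 * h)
    return (fx, fy, fz)


def force_exact(x, y, z):
    """Analytic result of Example 4.4."""
    return (-A * y ** 2, -2.0 * A * x * y, -B * C * math.cos(C * z))


point = (1.0, 2.0, 0.7)
numeric = force_from_potential(U, *point)
exact = force_exact(*point)
print(numeric)
print(exact)
```

The two printed vectors should agree closely. Each component is obtained by varying one coordinate while holding the other two fixed, which is exactly what the partial derivative means.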


It is sometimes convenient to remove the $f$ from (4.32) and to write

$$ \nabla = \hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y} + \hat{\mathbf{z}}\frac{\partial}{\partial z}. \tag{4.34} $$

In this view, $\nabla$ is a vector differential operator that can be applied to any scalar $f$ and
produces the vector given in (4.32).

A very useful application of the gradient is given by (4.28), whose right-hand side
you will recognize as $\nabla U \cdot d\mathbf{r}$. Thus, if we replace $U$ by an arbitrary scalar $f$, we see
that the change in $f$ resulting from a small displacement $d\mathbf{r}$ is just

$$ df = \nabla f \cdot d\mathbf{r}. \tag{4.35} $$


This useful relation is the three-dimensional analog of Equation (4.27) for a function 
of one variable. It shows the sense in which the gradient is the three-dimensional 
equivalent of the ordinary derivative in one dimension. 

If you have never met the $\nabla$ notation before, it will take a little getting used
to. Meanwhile, you can just think of (4.33) as a convenient shorthand for the three 
equations (4.30). For practice using the gradient, you could look at Problems 4.12 
through 4.19. 

4.4 The Second Condition that F be Conservative 


We have seen that one of the two conditions that a force $\mathbf{F}$ be conservative is that
the work $\int_1^2 \mathbf{F} \cdot d\mathbf{r}$ which it does moving between any two points 1 and 2 must be
independent of the path followed. You are certainly to be excused if you don’t see
how we could test whether a given force has this property. Checking the value of
the integral for every pair of points and every path joining those points is indeed a
formidable prospect! Fortunately, we never need to do this. There is a simple test,
which can be quickly applied to any force that is given in analytic form. This test
involves another of the basic concepts of vector calculus, this time the so-called curl
of a vector.

It can be shown (though I shall not do so here 8 ) that a force $\mathbf{F}$ has the desired
property, that the work it does is independent of path, if and only if

$$ \nabla \times \mathbf{F} = 0 \tag{4.36} $$

everywhere. The quantity $\nabla \times \mathbf{F}$ is called the curl of $\mathbf{F}$, or just “curl $\mathbf{F}$,” or “del cross
$\mathbf{F}$.” It is defined by taking the cross product of $\nabla$ and $\mathbf{F}$ just as if the components
of $\nabla$, namely $(\partial/\partial x, \partial/\partial y, \partial/\partial z)$, were ordinary numbers. To see what this means,
consider first the cross product of two ordinary vectors $\mathbf{A}$ and $\mathbf{B}$. In the table below, I
have listed the components of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{A} \times \mathbf{B}$:


vector                          x component              y component              z component
$\mathbf{A}$                    $A_x$                    $A_y$                    $A_z$
$\mathbf{B}$                    $B_x$                    $B_y$                    $B_z$
$\mathbf{A} \times \mathbf{B}$  $A_y B_z - A_z B_y$      $A_z B_x - A_x B_z$      $A_x B_y - A_y B_x$

(4.37)


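The components of $\nabla \times \mathbf{F}$ are found in exactly the same way, except that the entries
in the first row are differential operators. Thus,

vector                      x component                                                        y component                                                        z component
$\nabla$                    $\partial/\partial x$                                              $\partial/\partial y$                                              $\partial/\partial z$
$\mathbf{F}$                $F_x$                                                              $F_y$                                                              $F_z$
$\nabla \times \mathbf{F}$  $\dfrac{\partial F_z}{\partial y} - \dfrac{\partial F_y}{\partial z}$  $\dfrac{\partial F_x}{\partial z} - \dfrac{\partial F_z}{\partial x}$  $\dfrac{\partial F_y}{\partial x} - \dfrac{\partial F_x}{\partial y}$

(4.38)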


No one would claim that (4.36) is obviously equivalent to the condition that
$\int_1^2 \mathbf{F} \cdot d\mathbf{r}$ is path-independent, but it is, and it provides an easily applied test for the
path-independence property, as the following example shows.


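EXAMPLE 4.5 Is the Coulomb Force Conservative?

Consider the force $\mathbf{F}$ on a charge $q$ due to a fixed charge $Q$ at the origin. Show
that it is conservative and find the corresponding potential energy $U$. Check that
$-\nabla U = \mathbf{F}$.

The force in question is the Coulomb force, as shown in Figure 4.7(a),

$$ \mathbf{F} = \frac{kqQ}{r^2}\,\hat{\mathbf{r}} = \frac{\gamma}{r^3}\,\mathbf{r} \tag{4.39} $$

where $k$ denotes the Coulomb force constant, often written as $1/(4\pi\epsilon_0)$, and $\gamma$
is just an abbreviation for the constant $kqQ$. From the last expression we can
read off the components of $\mathbf{F}$, and using (4.38) we can calculate the components
of $\nabla \times \mathbf{F}$. For example, the $x$ component is

$$ (\nabla \times \mathbf{F})_x = \frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z} = \frac{\partial}{\partial y}\left(\frac{\gamma z}{r^3}\right) - \frac{\partial}{\partial z}\left(\frac{\gamma y}{r^3}\right). \tag{4.40} $$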

8 The condition (4.36) follows from a result called Stokes’s theorem. If you would like to explore 
this a little, see Problem 4.25. For more details, see any text on vector calculus or mathematical 
methods. I particularly like Mathematical Methods in the Physical Sciences by Mary Boas (Wiley, 
1983), p. 260.

Figure 4.7 (a) The Coulomb force $\mathbf{F} = \gamma\hat{\mathbf{r}}/r^2$ of the fixed
charge $Q$ on the charge $q$. (b) The work done by $\mathbf{F}$ as $q$
moves from $\mathbf{r}_0$ to $\mathbf{r}$ can be evaluated following a path that
goes radially outward to $P$ and then around a circle to $\mathbf{r}$.


The two derivatives here are easily evaluated: First, since $\partial z/\partial y = \partial y/\partial z = 0$,
we can rewrite (4.40) as

$$ (\nabla \times \mathbf{F})_x = \gamma z\,\frac{\partial}{\partial y}\,r^{-3} - \gamma y\,\frac{\partial}{\partial z}\,r^{-3}. \tag{4.41} $$

Next recall that

$$ r = (x^2 + y^2 + z^2)^{1/2}, $$

so that, for example,

$$ \frac{\partial r}{\partial y} = \frac{y}{r}. \tag{4.42} $$

(Check this one using the chain rule.) We can now evaluate the two remaining
derivatives in (4.41) to give (remember the chain rule again)

$$ (\nabla \times \mathbf{F})_x = \gamma z\left(\frac{-3}{r^4}\cdot\frac{y}{r}\right) - \gamma y\left(\frac{-3}{r^4}\cdot\frac{z}{r}\right) = 0. $$

The other two components work in exactly the same way (check it, if you don’t
believe me), and we conclude that $\nabla \times \mathbf{F} = 0$. According to the result (4.36),
this guarantees that $\mathbf{F}$ satisfies the second condition to be conservative. Since
it certainly satisfies the first condition (it depends only on the variable $\mathbf{r}$), we
have proved that $\mathbf{F}$ is conservative. (The proof that $\nabla \times \mathbf{F} = 0$ is considerably
quicker in spherical polar coordinates. See Problem 4.22.)

The potential energy is defined by the work integral (4.13),

$$ U(\mathbf{r}) = -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{F}(\mathbf{r}') \cdot d\mathbf{r}' \tag{4.43} $$

where $\mathbf{r}_0$ is the (as yet unspecified) reference point where $U(\mathbf{r}_0) = 0$. Fortunately,
we know that this integral is independent of path, so we can choose
whatever path is most convenient. One possibility is shown in Figure 4.7(b),
where I have chosen a path that goes radially outward to the point labeled $P$
and then around a circle (centered on $Q$) to $\mathbf{r}$. On the first segment, $\mathbf{F}(\mathbf{r}')$ and
$d\mathbf{r}'$ are collinear, and $\mathbf{F}(\mathbf{r}') \cdot d\mathbf{r}' = (\gamma/r'^2)\,dr'$. On the second, $\mathbf{F}(\mathbf{r}')$ and $d\mathbf{r}'$ are
perpendicular, so no work is done along this segment, and the total work is just
that of the first segment,

$$ U(\mathbf{r}) = -\int_{r_0}^{r} \frac{\gamma}{r'^2}\,dr' = \frac{\gamma}{r} - \frac{\gamma}{r_0}. \tag{4.44} $$

Finally, it is usual in this problem to choose the reference point $\mathbf{r}_0$ at infinity, so
that the second term here is zero. With this choice (and replacing $\gamma$ by $kqQ$) we
arrive at the well-known formula for the potential energy of the charge $q$ due
to $Q$,

$$ U(\mathbf{r}) = U(r) = \frac{kqQ}{r}. \tag{4.45} $$


Notice that the answer depends only on the magnitude r of the position vector 
r and not on the direction. 

To check $\nabla U$, let us evaluate the $x$ component:

$$(\nabla U)_x = \frac{\partial}{\partial x}\left(\frac{kqQ}{r}\right) = -\frac{kqQ}{r^2}\,\frac{\partial r}{\partial x} \tag{4.46}$$

where the last expression follows from the chain rule. The derivative $\partial r/\partial x$ is
$x/r$ [compare Equation (4.42)], so

$$(\nabla U)_x = -kqQ\,\frac{x}{r^3} = -F_x,$$

as given by (4.39). The other two components work in exactly the same way,
and we have shown that

$$\nabla U = -\mathbf{F} \tag{4.47}$$


as required. 
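
The two results of this example can also be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the illustrative value $\gamma = kqQ = 1$ and an arbitrary test point, and it uses central differences (a choice made here, with hypothetical helper names) to estimate $\nabla \times \mathbf{F}$ and $-\nabla U$.

```python
import math

GAMMA = 1.0  # stands for k*q*Q; the value 1.0 is an arbitrary illustrative choice


def force(x, y, z):
    """Coulomb force F = gamma * r / r^3, returned as the tuple (Fx, Fy, Fz)."""
    r = math.sqrt(x * x + y * y + z * z)
    return (GAMMA * x / r**3, GAMMA * y / r**3, GAMMA * z / r**3)


def potential(x, y, z):
    """Potential energy U = gamma / r, with the reference point at infinity."""
    return GAMMA / math.sqrt(x * x + y * y + z * z)


def partial(f, point, axis, h=1e-5):
    """Central-difference estimate of the partial derivative of f along one axis."""
    up = list(point)
    down = list(point)
    up[axis] += h
    down[axis] -= h
    return (f(*up) - f(*down)) / (2 * h)


def curl(point):
    """Numerical curl of the force at the given point."""
    fx = lambda *p: force(*p)[0]
    fy = lambda *p: force(*p)[1]
    fz = lambda *p: force(*p)[2]
    return (
        partial(fz, point, 1) - partial(fy, point, 2),
        partial(fx, point, 2) - partial(fz, point, 0),
        partial(fy, point, 0) - partial(fx, point, 1),
    )


def minus_grad_u(point):
    """Numerical value of -grad U at the given point."""
    return tuple(-partial(potential, point, axis) for axis in range(3))


point = (0.7, -1.2, 0.4)  # arbitrary test point away from the origin
print("curl F  =", curl(point))
print("-grad U =", minus_grad_u(point))
print("F       =", force(*point))
```

Up to finite-difference error, the curl should come out as zero and the last two printed vectors should agree, which is what (4.41) through (4.47) assert analytically.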


4.5 Time-Dependent Potential Energy 


We sometimes have occasion to study a force $\mathbf{F}(\mathbf{r}, t)$ that satisfies the second condition
to be conservative ($\nabla \times \mathbf{F} = 0$), but, because it is time-dependent, does not satisfy
the first condition. In this case, we can still define a potential energy $U(\mathbf{r}, t)$ with
the property that $\mathbf{F} = -\nabla U$, but it is no longer the case that total mechanical energy,
$E = T + U$, is conserved. Before I justify these claims, let me give an example of this
situation. Figure 4.8 shows a small charge $q$ in the vicinity of a charged conducting
sphere (for example, a Van de Graaff generator) with a charge $Q(t)$ that is slowly
leaking away through the moist air to ground. Because $Q(t)$ changes with time, the
force that it exerts on the small charge $q$ is explicitly time-dependent. Nevertheless,
the spatial dependence of the force is the same as for the time-independent Coulomb
force of Example 4.5 (page 119). Exactly the same analysis as in that example shows
that $\nabla \times \mathbf{F} = 0$.

Figure 4.8 The charge $Q(t)$ on the conducting sphere is
slowly leaking away, so the force on the small charge $q$
varies with time, even if its position $\mathbf{r}$ is constant.


Let me now justify the claims made above. First, since $\nabla \times \mathbf{F}(\mathbf{r}, t) = 0$, the same
mathematical theorem quoted in connection with Equation (4.36) guarantees that the
work integral $\int \mathbf{F}(\mathbf{r}, t) \cdot d\mathbf{r}$ (evaluated at any one time $t$) is path independent. This
means we can define a function $U(\mathbf{r}, t)$ by an integral exactly analogous to (4.13),

$$U(\mathbf{r}, t) = -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{F}(\mathbf{r}', t) \cdot d\mathbf{r}', \tag{4.48}$$

and, for the same reasons as before, $\mathbf{F}(\mathbf{r}, t) = -\nabla U(\mathbf{r}, t)$. (See Problem 4.27.) In this
case, we can say the force $\mathbf{F}$ is derivable from the time-dependent potential energy
$U(\mathbf{r}, t)$.

So far everything has gone through just as before, but now the story changes. We
can define the mechanical energy as $E = T + U$, but it is no longer true that $E$ is
conserved. If you review carefully the argument leading to Equation (4.19), you may
be able to see what goes wrong, but we can in any case show directly that $E = T + U$
changes as the particle moves on its path. As before, consider any two neighboring
points on the particle’s path at times $t$ and $t + dt$. Exactly as in (4.4), the change in
kinetic energy is

$$dT = \frac{dT}{dt}\,dt = (m\dot{\mathbf{v}} \cdot \mathbf{v})\,dt = \mathbf{F} \cdot d\mathbf{r}. \tag{4.49}$$

Meanwhile, $U(\mathbf{r}, t) = U(x, y, z, t)$ is a function of four variables $(x, y, z, t)$ and

$$dU = \frac{\partial U}{\partial x}\,dx + \frac{\partial U}{\partial y}\,dy + \frac{\partial U}{\partial z}\,dz + \frac{\partial U}{\partial t}\,dt. \tag{4.50}$$

You will recognize the first three terms on the right as $\nabla U \cdot d\mathbf{r} = -\mathbf{F} \cdot d\mathbf{r}$. Thus

$$dU = -\mathbf{F} \cdot d\mathbf{r} + \frac{\partial U}{\partial t}\,dt. \tag{4.51}$$

When we add this to Equation (4.49) the first two terms cancel, and we are left with

$$d(T + U) = \frac{\partial U}{\partial t}\,dt. \tag{4.52}$$

Clearly it is only when $U$ is independent of $t$ (that is, $\partial U/\partial t = 0$) that the mechanical
energy $E = T + U$ is conserved.
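
Equation (4.52) can be tested on a simple case. The sketch below is an illustration only and rests on assumptions not in the text: a one-dimensional particle in the hypothetical potential $U(x, t) = \tfrac{1}{2}k(t)x^2$ with $k(t) = k_0 e^{-t/\tau}$ (a spring that slowly weakens, loosely analogous to the leaking charge), integrated with a standard fourth-order Runge-Kutta step. It compares the change in $E = T + U$ with the integral of $\partial U/\partial t$ along the path.

```python
import math

M = 1.0    # mass (illustrative)
K0 = 4.0   # initial spring constant (illustrative)
TAU = 5.0  # decay time of the spring constant (illustrative)


def k(t):
    return K0 * math.exp(-t / TAU)


def U(x, t):
    return 0.5 * k(t) * x * x


def dU_dt(x, t):
    """Partial derivative of U with respect to t at fixed x."""
    return -U(x, t) / TAU


def accel(x, t):
    """Acceleration from F = -dU/dx = -k(t) x."""
    return -k(t) * x / M


def energy(x, v, t):
    return 0.5 * M * v * v + U(x, t)


def simulate(t_end=3.0, dt=1e-3):
    """Return (E_final - E_initial, integral of dU/dt along the path)."""
    x, v, t = 1.0, 0.0, 0.0
    e0 = energy(x, v, t)
    integral = 0.0
    for _ in range(int(round(t_end / dt))):
        g_old = dU_dt(x, t)
        # classical fourth-order Runge-Kutta step for (x, v)
        k1x, k1v = v, accel(x, t)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, t + 0.5 * dt)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, t + 0.5 * dt)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x, t + dt)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += dt
        # trapezoidal rule for the integral of dU/dt along the actual path
        integral += 0.5 * dt * (g_old + dU_dt(x, t))
    return energy(x, v, t) - e0, integral


delta_e, integral = simulate()
print("change in E        :", delta_e)
print("integral of dU/dt  :", integral)
```

The two printed numbers should agree to within the integration error, and both are negative here because this particular $U$ decreases with time at fixed $x$.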





Returning to the example of Figure 4.8, we can understand this conclusion and 
see what has happened to conservation of energy. Imagine that I hold the charge q 
stationary at the position of Figure 4.8, while the charge on the sphere leaks away. 
Under these conditions, the KE of $q$ doesn’t change, but the potential energy $kqQ(t)/r$
slowly diminishes to zero. Clearly T + U is not constant. However, while mechanical 
energy is not conserved, total energy is conserved: The loss of mechanical energy 
is exactly balanced by the gain of thermal energy as the discharge current heats up 
the surrounding air. This example suggests, what is true, that the potential energy 
depends explicitly on time in precisely those situations where mechanical energy gets 
transformed to some other form of energy or to mechanical energy of other bodies 
external to the system of interest. 


4.6 Energy for Linear One-Dimensional Systems 


So far we have discussed the energy of a particle that is free to move in all three 
dimensions. Many interesting problems involve an object that is constrained to move 
in just one dimension, and the analysis of such problems is remarkably simpler than 
the general case. Oddly enough, there is some ambiguity in what a physicist means 
by a “one-dimensional system.” Many introductory physics texts start out discussing 
the motion of a one-dimensional system, by which they mean an object (a railroad 
car, for instance) that is confined to move on a perfectly straight, or linear, track. 
In discussing such linear systems, we naturally take the x axis to coincide with the 
track, and the position of the object is then specified by the single coordinate x. In 
this section I shall focus on linear one-dimensional systems. However, there are much 
more complicated systems, such as a roller coaster on its curving track, that are also 
one-dimensional, inasmuch as their position can be specified by a single parameter 
(such as the distance of the roller coaster along its track). As I shall discuss in the next 
section, energy conservation for such curvilinear one-dimensional systems is just as 
straightforward as for a perfectly straight track. 

To begin, let us consider an object constrained to move along a perfectly straight 
track, which we take to be the x axis. The only component of any force F that can 
do work is the x component, and we can simply ignore the other two components. 
Therefore the work done by F is the one-dimensional integral 

$$W(x_1 \to x_2) = \int_{x_1}^{x_2} F_x(x)\,dx. \tag{4.53}$$

If the force is to be conservative, $F_x$ must satisfy the two usual conditions: (i) It must
depend only on the position $x$ [as I have already implied in writing the integral (4.53)].
(ii) The work (4.53) must be independent of path. The remarkable feature of one-dimensional
systems is that the first condition already guarantees the second, so the
latter is superfluous. To understand this property, you have only to recognize that
in one dimension there is only a small choice of paths connecting any two points.
Consider, for example, the two points A and B shown in Figure 4.9. The obvious path
between points A and B is the path that goes from A directly to B (let’s call this path
“AB”). Another possibility, shown in the figure, is to go from A past B to C and then
back to B (let’s call this one “ABCB”). The work done along this path can be broken
up as follows:

$$W(ABCB) = W(AB) + W(BC) + W(CB).$$

Figure 4.9 The path called ABCB goes from A past B
and on to C, then back to B.

Now, provided the force depends only on the position x [condition (i)] each 
increment of work going from B to C is exactly equal (but of opposite sign) to the 
corresponding contribution going from C to B. That is, the last two terms on the right 
cancel, and we conclude that 


W(ABCB) = W(AB), 

as required. One can of course concoct a path from A to B that doubles back and 
forth many times, but a little thought should convince you that any such path can be 
broken into a number of segments some of which together traverse the direct path AB 
exactly once, and all the rest of which cancel in pairs. Thus the work done on any 
path between A and B is the same as that on the direct path AB, and we have proved 
that in one dimension the first condition for a force to be conservative guarantees the 
second. 
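
The cancellation argument can be illustrated numerically. The following sketch is not from the text: the force law, the positions of A, B and C, and the midpoint-rule integrator are all arbitrary choices made for illustration. It evaluates the work along the direct path AB and along the detour ABCB.

```python
def force_x(x):
    """An arbitrary force that depends only on position (illustrative choice)."""
    return 3.0 * x * x - 2.0 * x + 1.0


def work(x_start, x_end, n=10_000):
    """Midpoint-rule estimate of the integral of F_x dx from x_start to x_end."""
    dx = (x_end - x_start) / n  # negative when the segment runs backward
    return sum(force_x(x_start + (i + 0.5) * dx) for i in range(n)) * dx


def work_along(path):
    """Total work along a path given as a list of successive positions."""
    return sum(work(a, b) for a, b in zip(path, path[1:]))


A, B, C = 0.0, 1.0, 2.5
w_direct = work_along([A, B])
w_detour = work_along([A, B, C, B])
print("W(AB)   =", w_direct)
print("W(ABCB) =", w_detour)
```

Because the segment from C back to B retraces the segment from B to C with the sign of every `dx` reversed, the two contributions cancel and both totals agree, up to rounding.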

Graphs of the Potential Energy 

A second useful feature of one-dimensional systems is that with only one independent 
variable (x) we can plot the potential energy U (x), and, as we shall see, this makes 
it easy to visualize the behavior of the system. Assuming all forces on the object are 
conservative, we define the potential energy as 

$$U(x) = -\int_{x_0}^{x} F_x(x')\,dx' \tag{4.54}$$

where $F_x$ is the $x$ component of the net force on the particle. For example, for a mass
on the end of a spring obeying Hooke’s law, the force is $F_x = -kx$, and, if we choose
the reference point $x_0 = 0$, Equation (4.54) gives the celebrated result

$$U = \tfrac{1}{2}kx^2$$


for any spring obeying Hooke’s law. 
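
For readers who want the intermediate step, the integral for the Hooke's-law case can be written out as follows. This is only the definition (4.54) with $F_x = -kx$ and $x_0 = 0$ substituted, followed by a check that differentiating recovers the force.

```latex
U(x) = -\int_{0}^{x} (-k x')\,dx' = k \int_{0}^{x} x'\,dx' = \tfrac{1}{2} k x^{2},
\qquad
-\frac{dU}{dx} = -kx = F_x .
```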

Corresponding to the three-dimensional result $\mathbf{F} = -\nabla U$, we have the simpler
result in one dimension

$$F_x = -\frac{dU}{dx}. \tag{4.55}$$

Figure 4.10 The graph of potential energy $U(x)$ against
$x$ for any one-dimensional system can be thought of as a
picture of a roller coaster track. The force $F_x = -dU/dx$
tends to push the object “downhill” as at $x_1$ and $x_2$. At the
points $x_3$ and $x_4$, where $U(x)$ is minimum or maximum,
$dU/dx = 0$ and the force is zero; such points are therefore
points of equilibrium.


If we plot the potential energy against $x$ as in Figure 4.10, we can easily see qualitatively
how the object has to behave. The direction of the net force is given by (4.55) as
“downhill” on the graph of $U(x)$: to the left at $x_1$ and to the right at $x_2$. It follows that
the object always accelerates in the “downhill” direction — a property that reminds 
one of the motion of a roller coaster, which also always accelerates downhill. This 
analogy is not an accident: For a roller coaster, U (x) is mgh (where h is the height 
above ground) and the graph of U(x) against x has the same shape as a graph of h 
against x, which is just a picture of the track. For any one-dimensional system, we 
can always think about the graph of U(x) as a picture of a roller coaster, and common 
sense will generally tell us the kind of motion that is possible at different places, as I 
now describe. 

At points, such as $x_3$ and $x_4$, where $dU/dx = 0$ and $U(x)$ is minimum or maximum,
the net force is zero, and the object can remain in equilibrium. That is, the condition
$dU/dx = 0$ characterizes points of equilibrium. At $x_3$, where $d^2U/dx^2 > 0$ and $U(x)$
is minimum, a small displacement from equilibrium causes a force which pushes the
object back to equilibrium (back to the left on the right of $x_3$, back to the right on the left
of $x_3$). In other words, equilibrium points where $d^2U/dx^2 > 0$ and $U(x)$ is minimum
are points of stable equilibrium. At equilibrium points like $x_4$ where $d^2U/dx^2 < 0$
and $U(x)$ is maximum, a small displacement leads to a force away from equilibrium,
and the equilibrium is unstable.
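
The equilibrium test described above is easy to automate. The following is a minimal sketch under stated assumptions: the double-well potential $U(x) = x^4 - 2x^2$ is a hypothetical example (it is not the curve of Figure 4.10), derivatives are estimated by finite differences, and equilibria are located by scanning for sign changes of $dU/dx$ and refining by bisection.

```python
def U(x):
    """Illustrative double-well potential energy (not taken from the text)."""
    return x**4 - 2.0 * x**2


def dU(x, h=1e-5):
    """Central-difference estimate of dU/dx."""
    return (U(x + h) - U(x - h)) / (2 * h)


def d2U(x, h=1e-4):
    """Central-difference estimate of the second derivative of U."""
    return (U(x + h) - 2 * U(x) + U(x - h)) / (h * h)


def bisect(f, lo, hi, tol=1e-10):
    """Find a root of f in [lo, hi], assuming f changes sign (or vanishes at hi)."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equilibria(x_min, x_max, n=250):
    """Scan [x_min, x_max] for zeros of dU/dx and classify each one."""
    xs = [x_min + (x_max - x_min) * i / n for i in range(n + 1)]
    found = []
    for a, b in zip(xs, xs[1:]):
        fa, fb = dU(a), dU(b)
        # a zero exactly at the right-hand grid point is counted once, here
        if fa * fb < 0 or fb == 0.0:
            x_eq = bisect(dU, a, b)
            kind = "stable" if d2U(x_eq) > 0 else "unstable"
            found.append((round(x_eq, 6), kind))
    return found


for x_eq, kind in equilibria(-1.7, 1.9):
    print(f"x = {x_eq:+.4f}: {kind} equilibrium")
```

For this example the scan should report the two minima near $x = \pm 1$ as stable and the maximum near $x = 0$ as unstable. A zero of $dU/dx$ at which the second derivative also vanishes would need a closer look than this sketch provides.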

If the object is moving then its kinetic energy is positive and its total energy is
necessarily greater than $U(x)$. For example, suppose the object is moving somewhere
near the equilibrium point $x = b$ in Figure 4.11. Its total energy has to be
greater than $U(b)$ and could, for example, equal the value shown as $E$ in that figure.
If the object happens to be on the right of $b$ and moving toward the right, its
PE will increase and its KE must therefore decrease until the object reaches the
turning point labeled $c$, where $U(c) = E$ and the KE is zero. At $x = c$ the object
stops and, with the force back to the left, it accelerates back toward $x = b$. It cannot
now stop until once again the KE is zero, and this occurs at the turning point
$a$, where $U(a) = E$ and the object accelerates back to the right. Since the whole
cycle now repeats itself, we see that if the object starts out between two hills and
its energy is lower than the crest of both hills, then the object is trapped in the
valley or “well” and oscillates indefinitely between the two turning points where
$U(x) = E$.

Figure 4.11 If an object starts out near $x = b$ with the
energy $E$ shown, it is trapped in the valley or “well”
between the two hills and oscillates between the turning
points at $x = a$ and $c$ where $U(x) = E$ and the kinetic
energy is zero.

Suppose the cart again starts out between the two hills but with energy higher than 
the crest of the right hill though still lower than the left. In this case, it will escape 
to the right since E > U(x) everywhere on the right, and it can never stop once it is 
moving in that direction. Finally, if the energy is higher than both hills, the cart can 
escape in either direction. 

These considerations play an important role in many fields. An example from
molecular physics is illustrated in Figure 4.12, which shows the potential energy of a
typical diatomic molecule, such as HCl, as a function of the distance between the two
atoms. This potential energy function governs the radial motion of the hydrogen atom
(in the case of HCl) as it vibrates in and out from the much heavier chlorine atom.
The zero of energy has been chosen where the two atoms are far apart (at infinity)
and at rest. Notice that the independent variable is the interatomic distance $r$ which,
by its definition, is always positive, $0 < r < \infty$. As $r \to 0$, the potential energy gets
very large, indicating that the two atoms repel one another when very close together
(because of the Coulomb repulsion of the nuclei). If the energy is positive ($E > 0$)
the H atom can escape to infinity, since there is no “hill” to trap it; the H atom can
come in from infinity, but it will stop at the turning point $r = a$ and (in the absence of
any mechanism to take up some of its energy) it will move away to infinity again. On
the other hand, if $E < 0$, the H atom is trapped and will oscillate in and out between
the two turning points shown at $r = b$ and $r = d$. The equilibrium separation of the
molecule is at the point shown as $r = c$. It is the states with $E < 0$ that correspond to
what we normally regard as the HCl molecule. To form such a molecule, two separate
atoms (with $E > 0$) must come together to a separation somewhere near $r = c$, and
some process, such as emission of light, must remove enough energy to leave the two
atoms trapped with $E < 0$.

Figure 4.12 The potential energy for a typical diatomic
molecule such as HCl, plotted as a function of the distance
$r$ between the two atoms. If $E > 0$, the two atoms cannot
approach closer than the turning point $r = a$, but they can
move apart to infinity. If $E < 0$, they are trapped between the
turning points at $b$ and $d$ and form a bound molecule. The
equilibrium separation is $r = c$.
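
The turning points discussed for the diatomic molecule can be located numerically once a specific $U(r)$ is chosen. The sketch below is an illustration only: it assumes a Lennard-Jones-type form $U(r) = \epsilon[(c/r)^{12} - 2(c/r)^6]$, which has the qualitative shape described above (strong repulsion as $r \to 0$, a minimum of depth $\epsilon$ at $r = c$, and $U \to 0$ at infinity) but is not claimed to be the curve plotted in Figure 4.12. The units and the search brackets are likewise arbitrary.

```python
import math

EPS = 1.0  # well depth (illustrative units)
C = 1.0    # equilibrium separation r = c (illustrative units)


def U(r):
    """Illustrative Lennard-Jones-type potential: minimum -EPS at r = C, zero at infinity."""
    u = (C / r) ** 6
    return EPS * (u * u - 2.0 * u)


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def turning_points(E, r_min=0.5, r_max=50.0):
    """Return the radii where U(r) = E.

    For -EPS < E < 0 there are two (inner b, outer d); for E > 0 only the inner one (a).
    The brackets r_min and r_max are assumptions that suit energies of order EPS.
    """
    g = lambda r: U(r) - E
    points = [bisect(g, r_min, C)]          # inner turning point, on the repulsive wall
    if E < 0:
        points.append(bisect(g, C, r_max))  # outer turning point, bound states only
    return points


print("bound state, E = -0.5:", turning_points(-0.5))
print("unbound,     E = +0.5:", turning_points(0.5))
```

With $E < 0$ the two roots bracket the equilibrium separation, corresponding to $b < c < d$ in the figure; with $E > 0$ only the inner turning point exists and the atom is free to move off to infinity.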


Complete Solution of the Motion 

A third remarkable feature of one-dimensional conservative systems is that we can,
at least in principle, use the conservation of energy to obtain a complete solution of
the motion, that is, to find the position $x$ as a function of time $t$. Since $E = T + U(x)$
is conserved, with $U(x)$ a known function (in the context of a given problem) and $E$
determined by the initial conditions, we can solve for $T = \tfrac{1}{2}m\dot{x}^2 = E - U(x)$ and
hence for the velocity $\dot{x}$ as a function of $x$:

$$\dot{x}(x) = \pm\sqrt{\frac{2}{m}}\,\sqrt{E - U(x)}. \tag{4.56}$$

(Notice that there is an ambiguity in the sign since energy considerations cannot
determine the direction of the velocity. For this reason, the method described here
usually does not work in a truly three-dimensional problem. In one dimension, you
can almost always decide the sign of $\dot{x}$ by inspection, though you must remember to
do so.)

Knowing the velocity as a function of $x$, we can now find $x$ as a function of $t$,
using separation of variables, as follows: We first rewrite the definition $\dot{x} = dx/dt$ as

$$dt = \frac{dx}{\dot{x}}.$$

[Since $\dot{x} = \dot{x}(x)$, this separates the variables $t$ and $x$.] Next, we can integrate between
any initial and final points to give

$$t_f - t_i = \int_{x_i}^{x_f} \frac{dx}{\dot{x}}. \tag{4.57}$$

This gives the time for travel between any initial and final positions of interest. If we
substitute for $\dot{x}$ from (4.56) (and assume, to be definite, that $\dot{x}$ is positive) then the
time to go from the initial $x_0$ at time 0 to an arbitrary $x$ at time $t$ is

$$t = \int_{x_0}^{x} \frac{dx'}{\dot{x}(x')} = \sqrt{\frac{m}{2}} \int_{x_0}^{x} \frac{dx'}{\sqrt{E - U(x')}}. \tag{4.58}$$


(As usual, I’ve renamed the variable of integration as $x'$ to avoid confusion with
the upper limit $x$.) The integral (4.58) depends on the particular form of $U(x)$ in
the problem at hand. Assuming we can do the integral [and we can at least do it
numerically for any given $U(x)$], it gives us $t$ as a function of $x$. Finally we can solve
to give x as a function of t, and our solution is complete, as the following simple 
example illustrates. 


Example 4.6 Free Fall

I drop a stone from the top of a tower at time $t = 0$. Use conservation of energy
to find the stone’s position $x$ (measured down from the top of the tower, where
$x = 0$) as a function of $t$. Neglect air resistance.

The only force on the stone is gravity, which is, of course, conservative. The
corresponding potential energy is

$$U(x) = -mgx.$$

(Remember $x$ is measured downward.) Since the stone is at rest when $x = 0$,
the total energy is $E = 0$, and according to (4.56) the velocity is

$$\dot{x}(x) = \sqrt{\frac{2}{m}}\,\sqrt{E - U(x)} = \sqrt{2gx}$$

(a result that is well known from elementary kinematics). Thus

$$t = \int_0^x \frac{dx'}{\dot{x}(x')} = \int_0^x \frac{dx'}{\sqrt{2gx'}} = \sqrt{\frac{2x}{g}}.$$

As anticipated, this gives $t$ as a function of $x$, and we can solve to give the
familiar result

$$x = \tfrac{1}{2}gt^2.$$

This simple example, involving the gravitational potential energy $U(x) = -mgx$,
can be solved many different (and some simpler) ways, but the energy
method used here can be used for any potential energy function $U(x)$. In some
cases, the integral (4.58) can be evaluated in terms of elementary functions, and
we obtain an analytic solution of the problem; for example, if $U(x) = \tfrac{1}{2}kx^2$ (as
for a mass on the end of a spring), the integral turns out to be an inverse sine
function, which implies that $x$ oscillates sinusoidally with time, as we should
expect (see Problem 4.28). For some potential energies, the integral cannot be
done in terms of elementary functions, but can nonetheless be related to functions
that are tabulated (see Problem 4.38). For some problems, the only way to
do the integral (4.58) is to do it numerically. 
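
As a rough illustration of the numerical route, the following sketch evaluates the integral (4.58) with the midpoint rule. The function name `travel_time` and the substitution $x' = x_0 + s^2$ are choices made here, not the book's method: the substitution is used because the integrand has an inverse-square-root singularity whenever the motion starts from rest at $x_0$. The sketch assumes $x > x_0$ and motion in the positive $x$ direction throughout.

```python
import math


def travel_time(U, E, m, x0, x, n=2000):
    """Estimate t = sqrt(m/2) * integral from x0 to x of dx' / sqrt(E - U(x')).

    Assumes x > x0 and motion in the +x direction. The substitution
    x' = x0 + s**2 removes the inverse-square-root singularity that occurs
    when the motion starts from rest at x0.
    """
    s_max = math.sqrt(x - x0)
    ds = s_max / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds  # midpoint rule, so s = 0 is never evaluated
        xp = x0 + s * s
        total += 2.0 * s / math.sqrt(E - U(xp))
    return math.sqrt(m / 2.0) * total * ds


# Free fall from rest, as in the example above (g and the drop height are illustrative).
g, m = 9.8, 1.0
t_fall = travel_time(lambda x: -m * g * x, 0.0, m, 0.0, 20.0)
print("free fall, numerical:", t_fall, " analytic:", math.sqrt(2 * 20.0 / g))

# Mass on a spring released from rest at x = -A: time to reach x = 0 (a quarter period).
k, A = 4.0, 1.0
t_quarter = travel_time(lambda x: 0.5 * k * x * x, 0.5 * k * A * A, m, -A, 0.0)
print("spring, numerical:", t_quarter, " analytic:", 0.5 * math.pi * math.sqrt(m / k))
```

The free-fall case reproduces $t = \sqrt{2x/g}$, and the spring case gives a quarter of the period $2\pi\sqrt{m/k}$, consistent with the sinusoidal motion mentioned above.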


4.7 Curvilinear One-Dimensional Systems 


So far the only one-dimensional system I have discussed is an object constrained to 
move along a linear path, with position specified by the coordinate x. There are other, 
more general, systems that can equally be said to be one-dimensional, inasmuch as 
their position is specified by a single number. An example of such a one-dimensional 
system is a bead threaded on a curved rigid wire as illustrated in Figure 4.13. (Another 
is a roller coaster confined to a curved track.) The position of the bead can be specified 
by a single parameter, which we can choose as the distance $s$, measured along the wire,
from a chosen origin O. With this choice of coordinate, the discussion of the curved 
one-dimensional track parallels closely that of the straight track, as I now show. 

The coordinate $s$ of our bead corresponds, of course, to $x$ for a cart on a straight
track. The speed of the bead is easily seen to be $\dot{s}$, and the kinetic energy is therefore
just

$$T = \tfrac{1}{2}m\dot{s}^2,$$

as compared to the familiar $\tfrac{1}{2}m\dot{x}^2$ for the straight track. The force is a little more
complicated. As our bead moves on the curved wire the net normal force is not zero;
on the contrary, the normal force is what constrains the bead to follow its assigned
curving path. (For this reason, the normal force is called the force of constraint.) On
the other hand, the normal force does no work, and it is the tangential component
$F_{\text{tang}}$ of the net force that is our chief concern. In particular, it is fairly easy to show
(Problem 4.32) that

$$F_{\text{tang}} = m\ddot{s}$$

(just as $F_x = m\ddot{x}$ on a straight track). Further, if all the forces on the bead that have
a tangential component are conservative, we can define a corresponding potential
energy $U(s)$ such that $F_{\text{tang}} = -dU/ds$, and the total mechanical energy $E = T + U(s)$
is constant. The whole discussion of Section 4.6 can now be applied to the bead
on a curved wire (or any other object constrained to move on a one-dimensional path).
In particular, those points where $U(s)$ is a minimum are points of stable equilibrium,
and those where $U(s)$ is maximum are points of unstable equilibrium.

Figure 4.13 An object constrained to move on a curved
track can be considered to be a one-dimensional system,
with the position specified by the distance $s$ (measured
along the track) of the object from an origin $O$. The system
shown is a bead threaded on a stiff wire, bent into a double
loop-the-loop.

There are many systems that appear to be much more complicated than the bead 
on a wire, but are nonetheless one-dimensional and can be treated in much the same 
way. Here is an example. 


Example 4.7 Stability of a Cube Balanced on a Cylinder

A hard rubber cylinder of radius $r$ is held fixed with its axis horizontal, and
a wooden cube of side $2b$ is balanced on top of the cylinder, with its center
vertically above the cylinder’s axis and four of its sides parallel to the axis. The
cube cannot slip on the rubber of the cylinder, but it can of course rock from
side to side, as shown in Figure 4.14. By examining the cube’s potential energy,
find out if the equilibrium with the cube centered above the cylinder is stable or
unstable.

Let us first note that the system is one-dimensional, since its position as it
rocks from side to side can be specified by a single coordinate, for instance the
angle $\theta$ through which it has turned. (We could also specify it by the distance $s$
of the cube’s center from equilibrium, but the angle is a little more convenient.
Either way the system’s position is specified by a single coordinate, and our
problem is definitely one-dimensional.) The constraining forces are the normal
and frictional forces of the cylinder on the cube; that is, these two forces
constrain the cube to move only as shown in Figure 4.14. Since neither of these
does any work we need not consider them explicitly. The only other force on the
cube is gravity, and we know from elementary physics that this is conservative
and that the gravitational potential energy is the same as for a point mass at the
center of the cube; that is, $U = mgh$, where $h$ is the height of $C$ above the origin,
as shown in Figure 4.14. (See Problem 4.6.) The length of the line shown as $OB$
is just $r + b$, while the length $BC$ is the distance the cube has rolled around the
cylinder, namely $r\theta$. Therefore $h = (r + b)\cos\theta + r\theta\sin\theta$ and the potential
energy is

$$U(\theta) = mgh = mg[(r + b)\cos\theta + r\theta\sin\theta]. \tag{4.59}$$

Figure 4.14 A cube, of side $2b$ and center $C$, is
placed on a fixed horizontal cylinder of radius $r$ and
center $O$. It is originally put so that $C$ is centered
above $O$, but it can roll from side to side without
slipping.

To find the equilibrium position (or positions) we must find the points where
$dU/d\theta$ vanishes. (Strictly speaking I haven’t proved this very plausible claim
yet for this kind of constrained system; I’ll discuss it shortly.) The derivative is
easily seen to be (check this for yourself)

$$\frac{dU}{d\theta} = mg[r\theta\cos\theta - b\sin\theta].$$

This vanishes at $\theta = 0$, confirming the obvious: that $\theta = 0$ is a point of
equilibrium. To decide whether this equilibrium is stable, we have only to
differentiate again and find the value of $d^2U/d\theta^2$ at the equilibrium position.
This gives (as you should check)

$$\frac{d^2U}{d\theta^2} = mg(r - b) \tag{4.60}$$

(at $\theta = 0$). If the cube is smaller than the cylinder (that is, $b < r$), this second
derivative is positive, which means that $U(\theta)$ has a minimum at $\theta = 0$ and the
equilibrium is stable; if the cube is balanced on the cylinder, it will remain there 
indefinitely. On the other hand, if the cube is larger than the cylinder (b > r), the 
second derivative (4.60) is negative, the equilibrium is unstable, and the smallest 
disturbance will cause the cube to roll and fall off the cylinder. 
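
The result (4.60) can be checked directly from (4.59). In the sketch below, which is only an illustration, the function names and the sample values of $r$ and $b$ are hypothetical, and the second derivative is estimated by a central difference rather than computed analytically. The common factor $mg$ is divided out, so the quantity examined is $h(\theta) = U(\theta)/mg$.

```python
import math


def height(theta, r, b):
    """Height h of the cube's center above the cylinder's axis, from Eq. (4.59)."""
    return (r + b) * math.cos(theta) + r * theta * math.sin(theta)


def second_derivative_at_zero(r, b, h=1e-4):
    """Central-difference estimate of d^2 h / d theta^2 at theta = 0 (this is U''/mg)."""
    return (height(h, r, b) - 2 * height(0.0, r, b) + height(-h, r, b)) / (h * h)


def is_stable(r, b):
    """True if U(theta) has a minimum at theta = 0."""
    return second_derivative_at_zero(r, b) > 0


for r, b in [(1.0, 0.5), (0.5, 1.0)]:
    d2 = second_derivative_at_zero(r, b)
    print(f"r = {r}, b = {b}: U''/mg = {d2:+.4f} (expected {r - b:+.4f}),",
          "stable" if is_stable(r, b) else "unstable")
```

In both sample cases the estimate should match $r - b$, positive for the small cube and negative for the large one.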


Further Generalizations 

There are many other, more complicated systems that are still legitimately described
as one-dimensional. Such systems may comprise several bodies, but the bodies are
joined by struts or strings in such a way that just one parameter is needed to describe
the system’s position. An example of such a system is the Atwood machine shown in
Figure 4.15, which consists of two masses, $m_1$ and $m_2$, suspended from opposite ends
of a massless, inextensible string that passes over a frictionless pulley. (To simplify
the discussion, I shall assume the pulley is massless, although it is easy to allow for
a mass of the pulley.) The two masses can move up and down, but the forces of the
pulley on the string and the string on the masses constrain matters so that the mass
$m_2$ can move up only to the extent that $m_1$ moves down by exactly the same distance.
Thus the position of the whole system can be specified by a single parameter, for
example the height $x$ of $m_1$ below the pulley’s center as shown, and the system is
again one-dimensional.⁹

Figure 4.15 An Atwood machine consisting of two masses,
$m_1$ and $m_2$, suspended by a massless inextensible string
that passes over a massless, frictionless pulley. Because the
string’s length is fixed, the position of the whole system is
specified by the distance $x$ of $m_1$ below any convenient fixed
level. The forces on the two masses are their weights $m_1 g$
and $m_2 g$, and the tension forces $F_T$ (which are equal since
the pulley and string are massless).

Let us consider the energies of the masses $m_1$ and $m_2$. The forces acting on
them are gravity and the tension in the string. Since gravity is conservative, we can
introduce potential energies $U_1$ and $U_2$ for the gravitational forces, and our previous
considerations imply that in any displacement of the system,

$$\Delta T_1 + \Delta U_1 = W_1^{\text{ten}} \tag{4.61}$$

and

$$\Delta T_2 + \Delta U_2 = W_2^{\text{ten}} \tag{4.62}$$

where the terms $W^{\text{ten}}$ denote the work done by the tension on $m_1$ and $m_2$. Now, in
the absence of friction, the tension is the same all along the string. Thus, although
the tension certainly does work on the two individual masses, the work done on $m_1$ is
equal and opposite to that done on $m_2$, when $m_1$ moves down and $m_2$ moves an equal
distance up (or vice versa). That is,

$$W_1^{\text{ten}} = -W_2^{\text{ten}}. \tag{4.63}$$


⁹ You may object, correctly, that the masses can also move sideways. If this worries you, we can
thread each mass over a vertical frictionless rod, but these rods are actually unnecessary: As long as
we refrain from pushing the masses sideways, each will remain in a vertical line of its own accord.

Thus, if we add the two energy equations (4.61) and (4.62), the terms involving the
tension in the string cancel and we are left with

$$\Delta(T_1 + U_1 + T_2 + U_2) = 0.$$

That is, the total mechanical energy

$$E = T_1 + U_1 + T_2 + U_2 \tag{4.64}$$

is conserved. The beauty of this result is that all reference to the constraining forces 
of the string and pulley has disappeared. 
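
To see how much (4.64) delivers, here is a short worked sketch, offered as an illustration rather than as part of the original argument. It assumes that $x$ is the distance of $m_1$ below a fixed level, so that $m_2$ rises by the same amount and both masses have speed $|\dot{x}|$; the additive constant in the potential energy is arbitrary. Writing out the energy and requiring $dE/dt = 0$ gives the acceleration without any mention of the tension.

```latex
E = \tfrac{1}{2}(m_1 + m_2)\dot{x}^{2} - (m_1 - m_2)\,g\,x + \text{const},
\qquad
0 = \frac{dE}{dt} = \dot{x}\left[(m_1 + m_2)\ddot{x} - (m_1 - m_2)\,g\right]
\;\Longrightarrow\;
\ddot{x} = \frac{m_1 - m_2}{m_1 + m_2}\,g .
```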

It turns out that many systems which contain several particles that are constrained 
in some way (by strings, struts, or a track on which they must move, etc.) can be treated 
in this same way: The constraining forces are crucially important in determining 
how the system moves, but they do no work on the system as a whole. Thus in 
considering the total energy of the system, we can simply ignore the constraining 
forces. In particular, if all other forces are conservative (as with our example of the
Atwood machine), we can define a potential energy $U_\alpha$ for each particle $\alpha$, and the
total energy

$$E = \sum_{\alpha=1}^{N} (T_\alpha + U_\alpha)$$

is constant. If the system is also one-dimensional (position specified by just one 
parameter, as with the Atwood machine), then all of the considerations of Section 
4.6 apply. 

A careful discussion of constrained systems is far easier in the Lagrangian formulation
of mechanics than in the Newtonian. Thus I shall postpone any further discussion
to Chapter 7. In particular, the proof that a stable equilibrium normally corresponds 
to a minimum of the potential energy (for a large class of constrained systems) is 
sketched in Problem 7.47. 


4.8 Central Forces 


A three-dimensional situation that has some of the simplicity of one-dimensional 
problems is a particle that is subject to a central force, that is, a force that is everywhere 
directed toward or away from a fixed “force center.” If we take the force center to be 
the origin, a central force has the form 


$$\mathbf{F}(\mathbf{r}) = f(\mathbf{r})\,\hat{\mathbf{r}} \tag{4.65}$$

where the function $f(\mathbf{r})$ gives the magnitude of the force (and is positive if the force
is outward and negative if it is inward). An example of a central force is the Coulomb
force on a charge $q$ due to a second charge $Q$ at the origin; this has the familiar form

$$\mathbf{F}(\mathbf{r}) = \frac{kqQ}{r^2}\,\hat{\mathbf{r}}, \tag{4.66}$$

which is obviously an example of (4.65), with the magnitude function given by
$f(\mathbf{r}) = kqQ/r^2$. The Coulomb force has two additional properties not shared by all
central forces: First, as we have proved, it is conservative. Second, it is spherically
symmetric or rotationally invariant; that is, the magnitude function $f(\mathbf{r})$ in (4.65)
is independent of the direction of $\mathbf{r}$ and, hence, has the same value at all points at
the same distance from the origin. A compact way to express this second property of
spherical symmetry is to observe that the magnitude function $f(\mathbf{r})$ depends only on
the magnitude of the vector $\mathbf{r}$ and not its direction, so can be written as

$$f(\mathbf{r}) = f(r). \tag{4.67}$$

A remarkable feature of central forces is that the two properties just mentioned
always go together: A central force that is conservative is automatically spherically
symmetric, and, conversely, a central force that is spherically symmetric is automatically
conservative. These two results can be proved in several ways, but the most direct
proofs involve the use of spherical polar coordinates. Therefore, before offering any
proofs, I shall briefly review the definition of these coordinates.


Spherical Polar Coordinates 

The position of any point $P$ is, of course, identified by the vector $\mathbf{r}$ pointing from the
origin $O$ to $P$. The vector $\mathbf{r}$ can be specified by its Cartesian coordinates $(x, y, z)$,
but in problems involving spherical symmetry it is almost always more convenient to
specify $\mathbf{r}$ by its spherical polar coordinates $(r, \theta, \phi)$, as defined in Figure 4.16. The
first coordinate $r$ is just the distance of $P$ from the origin; that is, $r = |\mathbf{r}|$, as usual. The
angle $\theta$ is the angle between $\mathbf{r}$ and the $z$ axis. The angle $\phi$, often called the azimuth,
is the angle from the $x$ axis to the projection of $\mathbf{r}$ on the $xy$ plane, as shown.¹⁰ It is
a simple exercise (Problem 4.40) to relate the Cartesian coordinates $(x, y, z)$ to the
polar coordinates $(r, \theta, \phi)$ and vice versa. For example, by inspecting Figure 4.16 you
should be able to convince yourself that

$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad \text{and} \quad z = r\cos\theta. \tag{4.68}$$
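
The relations (4.68) and their inverse translate directly into code. The following is a minimal sketch using the physicists' convention described above ($\theta$ measured from the $z$ axis, $\phi$ the azimuth); the function names are hypothetical, and the inverse conventionally returns $\theta = 0$ at the origin, where the angles are undefined.

```python
import math


def spherical_to_cartesian(r, theta, phi):
    """Eq. (4.68): theta is the angle from the z axis, phi the azimuth."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def cartesian_to_spherical(x, y, z):
    """Inverse of Eq. (4.68): returns (r, theta, phi), 0 <= theta <= pi, -pi <= phi <= pi."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0  # angles are undefined at the origin
    phi = math.atan2(y, x)                      # handles all four quadrants
    return (r, theta, phi)


x, y, z = spherical_to_cartesian(2.0, math.pi / 3, math.pi / 4)
print("Cartesian:", (x, y, z))
print("back to spherical:", cartesian_to_spherical(x, y, z))
```

Using `math.atan2` rather than `math.atan(y / x)` avoids both the division by zero on the $y$ axis and the quadrant ambiguity of the plain arctangent.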

A beautiful use of spherical coordinates, which may help you to visualize them, 
is to specify positions on the surface of the earth. If we choose the origin at the center 
of the earth, then all points on the surface have the same value of r, namely the radius 
of the earth.¹¹ Thus positions on the surface can be specified by giving just the two 
angles $(\theta, \phi)$. If we choose our z axis to coincide with the north polar axis, then it is 
easy to see from Figure 4.16 that $\theta$ gives the latitude of the point P, measured down 
from the north pole. (Since latitude is traditionally measured up from the equator, our 
angle $\theta$ is often called the colatitude.) Similarly, $\phi$ is the longitude measured east from 
the meridian of the x axis. 
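
As a small illustration of (4.68), the following sketch converts between the two coordinate systems and checks that a round trip returns the starting point. The function names and the sample point are assumptions made for this illustration only; the inverse formulas use the standard `acos` and `atan2` conventions and assume the point is not the origin.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Equation (4.68): theta is measured from the z axis, phi is the azimuth."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def cartesian_to_spherical(x, y, z):
    """Inverse of (4.68); assumes the point is not the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)      # colatitude, 0 <= theta <= pi
    phi = math.atan2(y, x)        # azimuth, -pi < phi <= pi
    return r, theta, phi

# Round trip for an arbitrary sample point.
point = (1.0, -2.0, 0.5)
r, theta, phi = cartesian_to_spherical(*point)
back = spherical_to_cartesian(r, theta, phi)

# On the earth, the conventional latitude is 90 degrees minus the colatitude.
latitude_deg = 90.0 - math.degrees(theta)
```

With the z axis along the north polar axis, `latitude_deg` is the usual latitude measured up from the equator, while `theta` itself is the colatitude.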


10 You should be aware that, while the definitions given here are those always used by physicists, 
most mathematics texts reverse the roles of $\theta$ and $\phi$. 

11 Actually the earth isn’t perfectly spherical, so $r$ isn’t quite constant, but this doesn’t change 
the conclusion that any position on the surface can be specified by giving $\theta$ and $\phi$. 

Figure 4.16 The spherical polar coordinates $(r, \theta, \phi)$ of a 
point P are defined so that $r$ is the distance of P from the 
origin, $\theta$ is the angle between the line OP and the z axis, 
and $\phi$ is the angle of the line OQ from the x axis, where 
Q is the projection of P onto the xy plane. 


The statement that a function $f(\mathbf{r})$ is spherically symmetric is simply the statement 
that, with $\mathbf{r}$ expressed in spherical polars, $f$ is independent of $\theta$ and $\phi$. This is what 
we mean when we write $f(\mathbf{r}) = f(r)$, and the test for spherical symmetry is simply 
that the two partial derivatives $\partial f/\partial\theta$ and $\partial f/\partial\phi$ are both zero everywhere. 

The unit vectors $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\phi}}$ are defined in the usual way: First, $\hat{\mathbf{r}}$ is the unit vector 
pointing in the direction of movement if $r$ increases with $\theta$ and $\phi$ fixed. Thus, as shown 
in Figure 4.17, the vector $\hat{\mathbf{r}}$ points radially outward, and is just the unit vector in the 
direction of $\mathbf{r}$ as usual. (On the surface of the earth, $\hat{\mathbf{r}}$ points upward, in the direction 
of the local vertical.) Similarly, $\hat{\boldsymbol{\theta}}$ points in the direction of increasing $\theta$ with $r$ and $\phi$ 
fixed, that is, southward along a line of longitude. Finally, $\hat{\boldsymbol{\phi}}$ points in the direction of 
increasing $\phi$ with $r$ and $\theta$ fixed, that is, eastward along a circle of latitude. 

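Since the three unit vectors $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\phi}}$ are mutually perpendicular, we can evaluate 
dot products in spherical polars in just the same way as in Cartesians. Thus, if 

$$\mathbf{a} = a_r\hat{\mathbf{r}} + a_\theta\hat{\boldsymbol{\theta}} + a_\phi\hat{\boldsymbol{\phi}}$$

and 

$$\mathbf{b} = b_r\hat{\mathbf{r}} + b_\theta\hat{\boldsymbol{\theta}} + b_\phi\hat{\boldsymbol{\phi}}$$

then (make sure you see this) 

$$\mathbf{a}\cdot\mathbf{b} = a_r b_r + a_\theta b_\theta + a_\phi b_\phi. \tag{4.69}$$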

Like the unit vectors of two-dimensional polar coordinates, the unit vectors $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, 
and $\hat{\boldsymbol{\phi}}$ vary with position, and, as was the case in two dimensions, this variability 
complicates many calculations involving differentiation, as we shall now see. 
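
The properties of the three unit vectors can be checked numerically. The sketch below writes out their Cartesian components, which are not given in the text but follow from differentiating (4.68) with respect to $r$, $\theta$, and $\phi$ and normalizing. It then checks that the vectors are orthonormal, that the rule (4.69) agrees with an ordinary Cartesian dot product, and that the vectors change from point to point. All names and sample values are illustrative assumptions.

```python
import math

def unit_vectors(theta, phi):
    """Cartesian components of r-hat, theta-hat, phi-hat at the angles (theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    r_hat = (st * cp, st * sp, ct)         # radially outward
    theta_hat = (ct * cp, ct * sp, -st)    # "south": direction of increasing theta
    phi_hat = (-sp, cp, 0.0)               # "east": direction of increasing phi
    return r_hat, theta_hat, phi_hat

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

r_hat, theta_hat, phi_hat = unit_vectors(0.7, 2.1)

def to_cartesian(components):
    """Build a Cartesian vector from its (r, theta, phi) components at this point."""
    c_r, c_theta, c_phi = components
    return [c_r * r_hat[i] + c_theta * theta_hat[i] + c_phi * phi_hat[i]
            for i in range(3)]

a_sph, b_sph = (1.0, 2.0, -0.5), (0.3, -1.0, 4.0)   # arbitrary spherical components
dot_spherical = dot(a_sph, b_sph)                    # right side of (4.69)
dot_cartesian = dot(to_cartesian(a_sph), to_cartesian(b_sph))

# The same three vectors evaluated at a different point are different vectors.
r_hat_elsewhere, _, _ = unit_vectors(1.2, 0.4)
```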

The Gradient in Spherical Polar Coordinates 

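In Cartesian coordinates, we have seen that the components of $\nabla f$ are precisely the 
partial derivatives of $f$ with respect to x, y, and z, 

$$\nabla f = \hat{\mathbf{x}}\frac{\partial f}{\partial x} + \hat{\mathbf{y}}\frac{\partial f}{\partial y} + \hat{\mathbf{z}}\frac{\partial f}{\partial z}. \tag{4.70}$$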

The corresponding expression for $\nabla f$ in polar coordinates is not so straightforward. 
To find it, recall from (4.35) that, in a small displacement $d\mathbf{r}$, the change in any 
function $f(\mathbf{r})$ is 

$$df = \nabla f\cdot d\mathbf{r}. \tag{4.71}$$

To evaluate the small vector $d\mathbf{r}$ in polar coordinates, we must examine carefully what 
happens to the point $\mathbf{r}$ when we change $r$, $\theta$, and $\phi$: A small change $dr$ in $r$ moves 
the point a distance $dr$ radially out, in the direction of $\hat{\mathbf{r}}$. As you can see from Figure 
4.17, a small change $d\theta$ in $\theta$ moves the point around a circle of longitude (radius $r$) 
through a distance $r\,d\theta$ in the direction of $\hat{\boldsymbol{\theta}}$. (Note well the factor of $r$; the distance 
is not just $d\theta$.) Similarly, a small change $d\phi$ in $\phi$ moves the point around a circle of 
latitude (radius $r\sin\theta$) through a distance $r\sin\theta\,d\phi$. Putting all this together, we see 
that 

$$d\mathbf{r} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} + r\sin\theta\,d\phi\,\hat{\boldsymbol{\phi}}.$$

Knowing the components of $d\mathbf{r}$, we can now evaluate the dot product in (4.71) in 
terms of the unknown components of $\nabla f$, 

$$df = (\nabla f)_r\,dr + (\nabla f)_\theta\,r\,d\theta + (\nabla f)_\phi\,r\sin\theta\,d\phi. \tag{4.72}$$



Figure 4.17 The three unit vectors of spherical polar coordinates 
at the point P. The vector $\hat{\mathbf{r}}$ points radially out, 
$\hat{\boldsymbol{\theta}}$ points “south” along a line of longitude, and $\hat{\boldsymbol{\phi}}$ points 
“east” around a circle of latitude. 

Meanwhile, since $f$ is a function of the three variables $r$, $\theta$, $\phi$, the change in $f$ is, of 
course, 

$$df = \frac{\partial f}{\partial r}\,dr + \frac{\partial f}{\partial\theta}\,d\theta + \frac{\partial f}{\partial\phi}\,d\phi. \tag{4.73}$$


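Comparing (4.72) and (4.73), we conclude that the components of $\nabla f$ in spherical 
polars are 

$$(\nabla f)_r = \frac{\partial f}{\partial r}, \quad (\nabla f)_\theta = \frac{1}{r}\frac{\partial f}{\partial\theta}, \quad\text{and}\quad (\nabla f)_\phi = \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi} \tag{4.74}$$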


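or, a little more compactly, 

$$\nabla f = \hat{\mathbf{r}}\frac{\partial f}{\partial r} + \hat{\boldsymbol{\theta}}\frac{1}{r}\frac{\partial f}{\partial\theta} + \hat{\boldsymbol{\phi}}\frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}. \tag{4.75}$$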


Similar considerations apply to the curl and other operators of vector calculus, all 
of which are markedly more complicated in spherical polar coordinates (and all other 
non-Cartesian coordinates) than in Cartesian coordinates. Since the formulas for these 
operators are very hard to remember, I have listed the more important ones inside the 
back cover. Proofs can be found in any textbook of vector calculus.¹² Armed with 
these ideas, let us return to central forces. 
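
The spherical form of the gradient can be checked against the Cartesian form on a simple test function. The sketch below takes $f = r^2\sin\theta\cos\phi$ (equivalently $f = x\,r$), a function chosen purely for illustration. It evaluates the three spherical components $\partial f/\partial r$, $(1/r)\,\partial f/\partial\theta$, and $(1/(r\sin\theta))\,\partial f/\partial\phi$ by hand, assembles them into a Cartesian vector, and compares the result with a central-difference estimate of the Cartesian partial derivatives. The unit-vector components are the standard ones; the step size and sample point are assumptions.

```python
import math

def f_cart(x, y, z):
    # Test function f = x * r, which is r^2 sin(theta) cos(phi) in spherical polars.
    return x * math.sqrt(x * x + y * y + z * z)

def grad_cartesian(x, y, z, h=1e-6):
    """Central-difference estimate of the Cartesian partial derivatives of f."""
    return (
        (f_cart(x + h, y, z) - f_cart(x - h, y, z)) / (2 * h),
        (f_cart(x, y + h, z) - f_cart(x, y - h, z)) / (2 * h),
        (f_cart(x, y, z + h) - f_cart(x, y, z - h)) / (2 * h),
    )

def grad_spherical(r, theta, phi):
    """Spherical components of grad f for f = r^2 sin(theta) cos(phi), done by hand."""
    g_r = 2 * r * math.sin(theta) * math.cos(phi)    # df/dr
    g_theta = r * math.cos(theta) * math.cos(phi)    # (1/r) df/dtheta
    g_phi = -r * math.sin(phi)                       # (1/(r sin theta)) df/dphi
    return g_r, g_theta, g_phi

r, theta, phi = 1.3, 0.8, 2.0
st, ct = math.sin(theta), math.cos(theta)
sp, cp = math.sin(phi), math.cos(phi)
r_hat = (st * cp, st * sp, ct)
theta_hat = (ct * cp, ct * sp, -st)
phi_hat = (-sp, cp, 0.0)

g_r, g_theta, g_phi = grad_spherical(r, theta, phi)
# Assemble the spherical components into a Cartesian vector for comparison.
from_spherical = tuple(
    g_r * r_hat[i] + g_theta * theta_hat[i] + g_phi * phi_hat[i] for i in range(3)
)
from_cartesian = grad_cartesian(r * st * cp, r * st * sp, r * ct)
max_difference = max(abs(a - b) for a, b in zip(from_spherical, from_cartesian))
```

Dropping the factor $1/r$ or $1/(r\sin\theta)$ from the angular components makes the two results disagree, which is a quick way to see why those factors are needed.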


Conservative and Spherically Symmetric, Central Forces 

I claimed earlier that a central force is conservative if and only if it is spherically 
symmetric. This claim can be proved several different ways. The quickest proofs 
(though not necessarily the most insightful) use spherical polar coordinates. Let us 
assume first that the central force F(r) is conservative and try to prove that it must be 
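spherically symmetric. Since it is conservative, it can be expressed in the form $-\nabla U$, 
which according to (4.75), has the form 

$$\mathbf{F}(\mathbf{r}) = -\nabla U = -\hat{\mathbf{r}}\frac{\partial U}{\partial r} - \hat{\boldsymbol{\theta}}\frac{1}{r}\frac{\partial U}{\partial\theta} - \hat{\boldsymbol{\phi}}\frac{1}{r\sin\theta}\frac{\partial U}{\partial\phi}. \tag{4.76}$$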


Since $\mathbf{F}(\mathbf{r})$ is central, only its radial component can be nonzero, and the last two 
terms in (4.76) must be zero. This requires that $\partial U/\partial\theta = \partial U/\partial\phi = 0$; that is, $U(\mathbf{r})$ 
is spherically symmetric, and (4.76) reduces to 

$$\mathbf{F}(\mathbf{r}) = -\hat{\mathbf{r}}\frac{\partial U}{\partial r}.$$

Since $U$ is spherically symmetric (depends only on $r$), the same is true of $\partial U/\partial r$, 
and we see that the central force $\mathbf{F}(\mathbf{r})$ is indeed spherically symmetric. I shall leave 
the proof of the converse result, that a central force which is spherically symmetric is 
necessarily conservative, to the problems at the end of this chapter. (See Problems 4.43 
and 4.44, but the simplest proof mimics almost exactly the analysis of the Coulomb 
force in Example 4.5.) 

12 See, for example, Mary L. Boas, Mathematical Methods in the Physical Sciences, John Wiley, 
1983, p. 431. 

The importance of these results is this: First, because a force $\mathbf{F}(\mathbf{r})$ that is central and 
spherically symmetric has a magnitude that depends only on $r$, it is nearly as simple 
as a one-dimensional force. Second, although $\mathbf{F}(\mathbf{r})$ is certainly not actually a 
one-dimensional force (its direction still depends on $\theta$ and $\phi$), we shall see in Chapter 8 
that any problem involving this kind of force is mathematically equivalent to a certain 
related one-dimensional problem. 
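
The link between spherical symmetry and conservativeness can be illustrated numerically by estimating $\nabla\times\mathbf{F}$ for two central forces. In the sketch below, both forces point along $\hat{\mathbf{r}}$; one has the spherically symmetric magnitude $-1/r^2$, and the other has the direction-dependent magnitude $-\cos\theta/r^2$. Both magnitude functions, the sample point, and the finite-difference step are assumptions chosen for illustration. This is a spot check at one point, not a proof.

```python
import math

def central_force(x, y, z, symmetric=True):
    """F = f r-hat with f = -1/r^2 (symmetric) or f = -cos(theta)/r^2 (not symmetric)."""
    r = math.sqrt(x * x + y * y + z * z)
    f = -1.0 / r**2
    if not symmetric:
        f *= z / r                       # cos(theta) = z / r
    return (f * x / r, f * y / r, f * z / r)

def curl(force, x, y, z, h=1e-5, **kwargs):
    """Central-difference estimate of the curl of `force` at the point (x, y, z)."""
    def partial(component, axis):
        plus = [x, y, z]
        minus = [x, y, z]
        plus[axis] += h
        minus[axis] -= h
        return (force(*plus, **kwargs)[component]
                - force(*minus, **kwargs)[component]) / (2 * h)
    return (
        partial(2, 1) - partial(1, 2),   # dFz/dy - dFy/dz
        partial(0, 2) - partial(2, 0),   # dFx/dz - dFz/dx
        partial(1, 0) - partial(0, 1),   # dFy/dx - dFx/dy
    )

def norm(v):
    return math.sqrt(sum(c * c for c in v))

point = (0.4, -0.9, 0.7)
curl_symmetric = norm(curl(central_force, *point, symmetric=True))
curl_asymmetric = norm(curl(central_force, *point, symmetric=False))
```

The first curl vanishes to within numerical error, while the second does not, consistent with the statement that a central force is conservative only if it is spherically symmetric.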


4.9 Energy of Interaction of Two Particles 


Almost all of our discussion of energy has focused on the energy of a single particle 
(or any larger object that can be approximated as a particle). It is now time to extend 
the discussion to systems of several particles, and I shall naturally start with just two 
particles. In this section, I shall suppose that the two particles interact via forces $\mathbf{F}_{12}$ 
(on particle 1 by particle 2) and $\mathbf{F}_{21}$ (on particle 2 by particle 1), but that there are no 
other, external, forces. In general, the force $\mathbf{F}_{12}$ could depend on the positions of both 
particles, so can be written as 

$$\mathbf{F}_{12} = \mathbf{F}_{12}(\mathbf{r}_1, \mathbf{r}_2),$$

and by Newton’s third law 

$$\mathbf{F}_{12} = -\mathbf{F}_{21}.$$

As an example of such a two-particle system we could consider an isolated binary 
star, in which case the only two forces are the gravitational attraction of each star for 
the other. If we denote the vector pointing to star 1 from star 2 by $\mathbf{r}$, as in Figure 4.18, 
the force $\mathbf{F}_{12}$ is just the familiar 

$$\mathbf{F}_{12} = -\frac{G m_1 m_2}{r^2}\,\hat{\mathbf{r}} = -\frac{G m_1 m_2}{r^3}\,\mathbf{r}.$$

Figure 4.18 The vector $\mathbf{r}$ pointing to particle 1 
from particle 2 is just $\mathbf{r} = (\mathbf{r}_1 - \mathbf{r}_2)$. 

The vector $\mathbf{r}$ can be written in terms of the two positions $\mathbf{r}_1$ and $\mathbf{r}_2$. In fact, as can be 
seen in Figure 4.18, 

$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2.$$

Thus the force $\mathbf{F}_{12}$, expressed as a function of $\mathbf{r}_1$ and $\mathbf{r}_2$, is 

$$\mathbf{F}_{12} = -\frac{G m_1 m_2}{|\mathbf{r}_1 - \mathbf{r}_2|^3}\,(\mathbf{r}_1 - \mathbf{r}_2). \tag{4.77}$$


A striking property of the force (4.77) is that it depends on the two positions $\mathbf{r}_1$ and 
$\mathbf{r}_2$ only through the particular combination $\mathbf{r}_1 - \mathbf{r}_2$. This property is not an accident, 
and is in fact true of any isolated two-particle system. The reason is that any isolated 
system must be translationally invariant: If we bodily translate the system to a 
new position, without changing the relative positions of the particles, the interparticle 
forces should remain the same. This is illustrated in Figure 4.19, which shows a pair 
of points $\mathbf{r}_1$ and $\mathbf{r}_2$ and a second pair of points $\mathbf{s}_1$ and $\mathbf{s}_2$, with $\mathbf{s}_1 - \mathbf{s}_2 = \mathbf{r}_1 - \mathbf{r}_2$. Since 
the two points $\mathbf{r}_1$ and $\mathbf{r}_2$ could be simultaneously translated to $\mathbf{s}_1$ and $\mathbf{s}_2$, the force 
$\mathbf{F}_{12}(\mathbf{r}_1, \mathbf{r}_2)$ must be the same as $\mathbf{F}_{12}(\mathbf{s}_1, \mathbf{s}_2)$ for any points satisfying $\mathbf{r}_1 - \mathbf{r}_2 = \mathbf{s}_1 - \mathbf{s}_2$. 
In other words, $\mathbf{F}_{12}(\mathbf{r}_1, \mathbf{r}_2)$ depends only on $\mathbf{r}_1 - \mathbf{r}_2$, as claimed, and we can write 

$$\mathbf{F}_{12} = \mathbf{F}_{12}(\mathbf{r}_1 - \mathbf{r}_2). \tag{4.78}$$


The result (4.78) greatly simplifies our discussion. We can learn almost everything 
about the force $\mathbf{F}_{12}$ by fixing $\mathbf{r}_2$ at any convenient point. In particular, let us temporarily 
fix $\mathbf{r}_2$ at the origin, in which case (4.78) reduces to just $\mathbf{F}_{12}(\mathbf{r}_1)$. (This maneuver 
amounts to translating both particles until particle 2 is at the origin, and we know 
that the force is unaffected by any such translation.) With $\mathbf{r}_2$ fixed, our discussion of 
the force on a single particle now applies. For example, if the force $\mathbf{F}_{12}$ on particle 1 
is to be conservative, then it must satisfy 

$$\nabla_1\times\mathbf{F}_{12} = 0 \tag{4.79}$$

where $\nabla_1$ is the differential operator 

$$\nabla_1 = \hat{\mathbf{x}}\frac{\partial}{\partial x_1} + \hat{\mathbf{y}}\frac{\partial}{\partial y_1} + \hat{\mathbf{z}}\frac{\partial}{\partial z_1}$$

with respect to the coordinates $(x_1, y_1, z_1)$ of particle 1. If (4.79) is satisfied, we can 
define a potential energy $U(\mathbf{r}_1)$ such that the force on particle 1 is 

$$\mathbf{F}_{12} = -\nabla_1 U(\mathbf{r}_1).$$

Figure 4.19 If $\mathbf{r}_1 - \mathbf{r}_2 = \mathbf{s}_1 - \mathbf{s}_2$, then two particles at 
$\mathbf{r}_1$ and $\mathbf{r}_2$ could be bodily translated to $\mathbf{s}_1$ and $\mathbf{s}_2$ without 
affecting their relative positions. This means that the force 
between the particles at $\mathbf{r}_1$ and $\mathbf{r}_2$ must be the same as that 
at $\mathbf{s}_1$ and $\mathbf{s}_2$. 

This gives the force $\mathbf{F}_{12}$ for the case that particle 2 is at the origin. To find it for particle 
2 anywhere else we have only to translate back to an arbitrary position by replacing 
$\mathbf{r}_1$ with $\mathbf{r}_1 - \mathbf{r}_2$ to give 

$$\mathbf{F}_{12} = -\nabla_1 U(\mathbf{r}_1 - \mathbf{r}_2). \tag{4.80}$$

Notice that I don’t have to change the operator $\nabla_1$, since an operator like $\partial/\partial x_1$ is 
unchanged by addition of a constant to $x_1$. 

To find the reaction force $\mathbf{F}_{21}$ on particle 2, we have only to invoke Newton’s third 
law, which says that $\mathbf{F}_{21} = -\mathbf{F}_{12}$. That is, we have only to change the sign of (4.80). 
We can re-express this by noticing that 

$$\nabla_1 U(\mathbf{r}_1 - \mathbf{r}_2) = -\nabla_2 U(\mathbf{r}_1 - \mathbf{r}_2), \tag{4.81}$$

where $\nabla_2$ denotes the gradient with respect to the coordinates of particle 2. (To prove 
this, invoke the chain rule. See Problem 4.50.) So, instead of changing the sign of 
(4.80) to find $\mathbf{F}_{21}$, we can simply replace $\nabla_1$ by $\nabla_2$ to give 

$$\mathbf{F}_{21} = -\nabla_2 U(\mathbf{r}_1 - \mathbf{r}_2). \tag{4.82}$$

Equations (4.80) and (4.82) are a beautiful result that generalizes to multiparticle 
systems. To emphasize what they say, let me rewrite them as 

$$\begin{aligned} (\text{Force on particle 1}) &= -\nabla_1 U \\ (\text{Force on particle 2}) &= -\nabla_2 U. \end{aligned} \tag{4.83}$$

There is a single potential energy function U, from which we can derive both forces. 
To find the force on particle 1, we just take the gradient of U with respect to the 
coordinates of particle 1; to find the force on particle 2, we take the gradient with 
respect to the coordinates of particle 2. 
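
The idea of deriving both forces from one potential energy can be tried out numerically. The sketch below uses the gravitational potential energy $U = -Gm_1m_2/|\mathbf{r}_1 - \mathbf{r}_2|$, the standard potential whose gradient gives the force (4.77), and differentiates it with respect to each particle's coordinates by central differences. The masses, positions, unit choice $G = 1$, and function names are illustrative assumptions.

```python
import math

G, M1, M2 = 1.0, 2.0, 3.0    # illustrative values in arbitrary units

def potential(r1, r2):
    """Gravitational U(r1 - r2) = -G m1 m2 / |r1 - r2|."""
    return -G * M1 * M2 / math.dist(r1, r2)

def minus_gradient(which, r1, r2, h=1e-6):
    """-grad U with respect to the coordinates of particle `which` (1 or 2)."""
    force = []
    for axis in range(3):
        plus = [list(r1), list(r2)]
        minus = [list(r1), list(r2)]
        plus[which - 1][axis] += h
        minus[which - 1][axis] -= h
        dU = potential(*plus) - potential(*minus)
        force.append(-dU / (2 * h))
    return force

r1, r2 = (1.0, 0.5, -0.2), (-0.3, 0.1, 0.4)
F12 = minus_gradient(1, r1, r2)    # force on particle 1, as in (4.80)
F21 = minus_gradient(2, r1, r2)    # force on particle 2, as in (4.82)

# Direct evaluation of (4.77) for comparison.
separation = math.dist(r1, r2)
F12_direct = [-G * M1 * M2 * (a - b) / separation**3 for a, b in zip(r1, r2)]
```

The two numerical gradients come out equal and opposite, so Newton's third law is built into the single function $U(\mathbf{r}_1 - \mathbf{r}_2)$.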

Before generalizing this result to multiparticle systems, let us consider the conservation 
of energy for our two-particle system. Figure 4.20 shows the orbits of the two 
particles. During a short time interval $dt$, particle 1 moves through $d\mathbf{r}_1$ and particle 2 
through $d\mathbf{r}_2$, and work is done on both particles by the corresponding forces. By the 
work-KE theorem 

$$dT_1 = (\text{work on 1}) = d\mathbf{r}_1\cdot\mathbf{F}_{12}$$

and similarly 

$$dT_2 = (\text{work on 2}) = d\mathbf{r}_2\cdot\mathbf{F}_{21}.$$

Figure 4.20 Motion of two interacting particles. During a 
short time interval $dt$, particle 1 moves from $\mathbf{r}_1$ to $\mathbf{r}_1 + d\mathbf{r}_1$ 
and particle 2 from $\mathbf{r}_2$ to $\mathbf{r}_2 + d\mathbf{r}_2$. 


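Adding these, we find for the change in the total kinetic energy $T = T_1 + T_2$, 

$$dT = dT_1 + dT_2 = (\text{work on 1}) + (\text{work on 2}) = W_{\text{tot}} \tag{4.84}$$

where 

$$W_{\text{tot}} = d\mathbf{r}_1\cdot\mathbf{F}_{12} + d\mathbf{r}_2\cdot\mathbf{F}_{21}$$

denotes the total work done on both particles. Replacing $\mathbf{F}_{21}$ by $-\mathbf{F}_{12}$ and then 
replacing $\mathbf{F}_{12}$ with (4.80), we can rewrite $W_{\text{tot}}$ as 

$$W_{\text{tot}} = (d\mathbf{r}_1 - d\mathbf{r}_2)\cdot\mathbf{F}_{12} = d(\mathbf{r}_1 - \mathbf{r}_2)\cdot[-\nabla_1 U(\mathbf{r}_1 - \mathbf{r}_2)]. \tag{4.85}$$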

If we rename $(\mathbf{r}_1 - \mathbf{r}_2)$ as $\mathbf{r}$, then the right side of this equation can be seen to be just 
(minus) the change in the potential energy, and we find that¹³ 

$$W_{\text{tot}} = -d\mathbf{r}\cdot\nabla U(\mathbf{r}) = -dU \tag{4.86}$$

where the last step follows from the property (4.35) of the gradient operator. It is 
worth pausing to appreciate this important result. The total work $W_{\text{tot}}$ is the sum of 
two terms, the work done by $\mathbf{F}_{12}$ as particle 1 moves through $d\mathbf{r}_1$ plus the work done 
by $\mathbf{F}_{21}$ as particle 2 moves through $d\mathbf{r}_2$. According to (4.86), the potential energy $U$ 
takes both of these terms into account and $W_{\text{tot}}$ is simply $-dU$. 
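
The statement that the total work on both particles equals minus the change in the single potential energy can be checked for small but finite displacements. The sketch below assumes a spring-like interaction $U = \tfrac12 K|\mathbf{r}_1 - \mathbf{r}_2|^2$, so that $\mathbf{F}_{12} = -K(\mathbf{r}_1 - \mathbf{r}_2)$. The constant `K`, the positions, and the displacements are hypothetical values chosen only for illustration.

```python
import math

K = 1.5    # hypothetical spring constant; U = (1/2) K |r1 - r2|^2

def potential(r1, r2):
    return 0.5 * K * math.dist(r1, r2) ** 2

def force_on_1(r1, r2):
    """F12 = -grad_1 U = -K (r1 - r2); the force on particle 2 is its negative."""
    return [-K * (a - b) for a, b in zip(r1, r2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

r1, r2 = [0.8, -0.1, 0.3], [-0.4, 0.6, 0.0]
dr1, dr2 = [1e-5, -2e-5, 3e-5], [-4e-5, 1e-5, 2e-5]    # small, unequal displacements

F12 = force_on_1(r1, r2)
F21 = [-c for c in F12]
W_tot = dot(dr1, F12) + dot(dr2, F21)                   # work on 1 plus work on 2

r1_new = [a + d for a, d in zip(r1, dr1)]
r2_new = [a + d for a, d in zip(r2, dr2)]
dU = potential(r1_new, r2_new) - potential(r1, r2)      # change in the one function U
```

The two quantities `W_tot` and `-dU` agree up to terms of second order in the displacements, which vanish in the infinitesimal limit used in the text.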

Returning to the total kinetic energy, we now see that according to (4.84) the change 
$dT$ is just $-dU$. Moving the term $dU$ to the other side, we conclude that 

$$d(T + U) = 0.$$

That is, the total energy, 

$$E = T + U = T_1 + T_2 + U, \tag{4.87}$$

of our two-particle system is conserved. Note well that the total energy of our two 
particles contains two kinetic terms (of course), but only one potential term, since $U$ 
accounts for the work done by both of the forces $\mathbf{F}_{12}$ and $\mathbf{F}_{21}$. 

13 If you invoke the chain rule for differentiation, you can see that it makes no difference whether 
we write $\nabla_1 U(\mathbf{r})$ or $\nabla U(\mathbf{r})$. 


Elastic Collisions 

Elastic collisions give a simple application of these ideas. An elastic collision is a 
collision between two particles (or bodies that can be treated as particles) that interact 
via a conservative force that goes to zero as their separation $|\mathbf{r}_1 - \mathbf{r}_2|$ increases. Since 
the force goes to zero as $|\mathbf{r}_1 - \mathbf{r}_2| \to \infty$, the potential energy $U(\mathbf{r}_1 - \mathbf{r}_2)$ approaches 
a constant, which we may as well take to be zero. For example, the two particles could 
be an electron and a proton, or they could be two billiard balls. That the force between 
two billiard balls is conservative is not obvious, but it is a fact that billiard balls are 
manufactured so that they behave like almost perfect (that is, conservative) springs 
when they are forced together. It is certainly easy to think of other objects (such as 
lumps of putty), for which the interobject force is nonconservative, and the collisions 
of such objects are not elastic. 

In a collision, the two particles start out far apart, approach one another, and then 
move apart again. Because the forces are conservative, the total energy is conserved; 
that is, $T + U$ = constant (where, of course, $T = T_1 + T_2$). But when the particles are 
far apart, $U$ is zero. Thus if we use the subscripts “in” and “fin” to label the situations 
well before and well after the particles come together, then conservation of energy 
implies that 

$$T_{\text{in}} = T_{\text{fin}}. \tag{4.88}$$

In other words, an elastic collision can be characterized as a collision in which two 
particles come together and re-emerge with their total kinetic energy unchanged. 
However, it is important to remember that there is no principle of conservation of 
kinetic energy. On the contrary, while the particles are close together their PE is 
nonzero and their KE certainly is changing. It is only when they are well separated 
that the PE is negligible and conservation of energy leads to the result (4.88). 

The foregoing discussion may suggest that elastic collisions should be a very common 
occurrence. All that is needed is two particles whose interaction is conservative. 
In practice, elastic collisions are not as widespread as this seems to imply. The trouble 
comes from the requirement that it be two particles that enter and leave the collision. 
For example, if we fire one billiard ball at a second with sufficient energy, the two balls 
may shatter. Similarly, if we fire an electron with sufficient energy at an atom, the atom 
may fall apart or, at least, change the internal motion of its constituents. Even in the 
collision of two genuine particles, such as an electron and a proton, relativity tells 
us that, with sufficient energy, new particles can be created. Clearly, at high enough 
energy, the assumption that the two objects entering a collision can be approximated 
as indivisible particles eventually breaks down, and we cannot assume that collisions 
will be elastic, even if all the underlying forces are conservative. Nevertheless, at reasonably 
low energies there are many situations where collisions are perfectly elastic: 
At sufficiently low energy, collisions of an electron with an atom always are, and to a 
good approximation, the same is true of billiard balls. 

Elastic collisions provide several simple illustrations of the uses of conservation 
of energy and momentum, of which the following is one. 


Example 4.8 An Equal-Mass, Elastic Collision 

Consider an elastic collision between two particles of equal mass, $m_1 = m_2 = m$ 
(for example, two electrons, or two billiard balls), as shown in Figure 4.21. Prove 
that if particle 2 is initially at rest then the angle between the two outgoing 
velocities is $\theta = 90^\circ$. 

Conservation of momentum implies that $m\mathbf{v}_1 = m\mathbf{v}'_1 + m\mathbf{v}'_2$ or 

$$\mathbf{v}_1 = \mathbf{v}'_1 + \mathbf{v}'_2. \tag{4.89}$$

That the collision is elastic implies that $\tfrac12 m v_1^2 = \tfrac12 m v_1'^2 + \tfrac12 m v_2'^2$ or 

$$v_1^2 = v_1'^2 + v_2'^2.$$

Squaring (4.89), we find that 

$$v_1^2 = v_1'^2 + 2\,\mathbf{v}'_1\cdot\mathbf{v}'_2 + v_2'^2,$$

and comparing the last two equations we see that 

$$\mathbf{v}'_1\cdot\mathbf{v}'_2 = 0;$$

that is, $\mathbf{v}'_1$ and $\mathbf{v}'_2$ are perpendicular (unless one of them is zero, in which case the 
angle between them is undefined). This result was useful in atomic and nuclear 
physics; when an unknown projectile hit a stationary target particle, the fact that 
the two emerged traveling at 90° was taken as evidence that the collision was 
elastic and the two particles had equal masses. 
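
The 90° result can be seen in a concrete model. The sketch below assumes a collision of smooth, equal-mass spheres in which the impulse acts along a single unit vector `n` (a hypothetical input representing the line of centers at contact). This is one particular elastic collision model, not the general interaction treated in the example. In this model the target leaves along `n` with the component of the incoming velocity in that direction, and momentum conservation fixes the projectile's outgoing velocity.

```python
import math

def equal_mass_collision(v1, n):
    """Outgoing velocities when particle 1 (velocity v1) strikes an equal-mass
    particle at rest, with the impulse directed along the unit vector n."""
    along = sum(a * b for a, b in zip(v1, n))       # component of v1 along n
    v2_out = [along * c for c in n]                 # target leaves along n
    v1_out = [a - b for a, b in zip(v1, v2_out)]    # momentum conservation, (4.89)
    return v1_out, v2_out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

v1 = [3.0, 0.0, 0.0]
impulse_angle = math.radians(35.0)                  # illustrative choice
n = [math.cos(impulse_angle), math.sin(impulse_angle), 0.0]

v1_out, v2_out = equal_mass_collision(v1, n)
kinetic_in = dot(v1, v1)                            # kinetic energy in units of m/2
kinetic_out = dot(v1_out, v1_out) + dot(v2_out, v2_out)

cos_opening = dot(v1_out, v2_out) / math.sqrt(dot(v1_out, v1_out) * dot(v2_out, v2_out))
opening_angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_opening))))
```

Changing `impulse_angle` changes the individual outgoing directions but leaves the opening angle between them at a right angle, as long as neither outgoing velocity is zero.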



Figure 4.21 Elastic collision between two equal-mass 
particles. Particle 1 enters with velocity $\mathbf{v}_1$ and 
collides with the stationary particle 2. The angle 
between the two final velocities $\mathbf{v}'_1$ and $\mathbf{v}'_2$ is $\theta$. 

4.10 The Energy of a Multiparticle System 


We can extend our discussion of two particles to N particles fairly easily. The main 
complication is notational: The large number of $\sum$ signs can make it hard to see 
clearly what is going on. For this reason, I shall start by considering the case of four 
particles (N = 4) and write out all of the various sums explicitly. 


Four Particles 

Let us consider, then, four particles, as shown in Figure 4.22. The particles can interact 
with each other (for example, they could be charged, so that each particle experiences 
the Coulomb force from the three others), and they may be subject to external forces, 
such as gravity or the Coulomb force of nearby charged bodies. In defining the energy 
of this system, the easy part is the kinetic energy $T$, which is, of course, the sum of 
four terms, 

$$T = T_1 + T_2 + T_3 + T_4, \tag{4.90}$$

one term $T_\alpha = \tfrac12 m_\alpha v_\alpha^2$ for each particle. 

To define the potential energy, we must examine the forces on the particles. First, 
there are the internal forces of the four particles interacting with each other. For each 
pair of particles there is an action-reaction pair of forces; for example, particles 3 
and 4 produce the forces $\mathbf{F}_{34}$ and $\mathbf{F}_{43}$ shown in Figure 4.22. I shall take for granted 
that each of these interparticle forces $\mathbf{F}_{\alpha\beta}$ is unaffected by the presence of the other 
particles and any external bodies. For example, $\mathbf{F}_{34}$ is just the same as if particles 1 
and 2 and all external bodies were removed.¹⁴ Thus, we can treat the two forces $\mathbf{F}_{34}$ 
and $\mathbf{F}_{43}$ exactly as in Section 4.9. Provided the forces are conservative, we can define 
a potential energy 

$$U_{34} = U_{34}(\mathbf{r}_3 - \mathbf{r}_4) \tag{4.91}$$

and the corresponding forces are the appropriate gradients as in (4.83) 

$$\mathbf{F}_{34} = -\nabla_3 U_{34} \quad\text{and}\quad \mathbf{F}_{43} = -\nabla_4 U_{34}. \tag{4.92}$$

Figure 4.22 A system of four particles $\alpha = 1, 2, 3, 4$. For 
each pair of particles, $\alpha\beta$, there is an action-reaction 
pair of forces, $\mathbf{F}_{\alpha\beta}$ and $\mathbf{F}_{\beta\alpha}$, such as the pair $\mathbf{F}_{34}$ and 
$\mathbf{F}_{43}$ shown. In addition, each particle $\alpha$ may be subject 
to an external net force $\mathbf{F}_\alpha^{\text{ext}}$. The four particles could be 
charged dust motes floating in the air, with the forces $\mathbf{F}_{\alpha\beta}$ 
being electrostatic and $\mathbf{F}_\alpha^{\text{ext}}$ being gravity plus buoyancy 
of the air. 

There are in all six distinct pairs of particles, 12, 13, 14, 23, 24, 34, and for each 
pair we can define a corresponding potential energy $U_{12}, \cdots, U_{34}$ from which the 
corresponding forces are obtained in the same way. 

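Each of the external forces $\mathbf{F}_\alpha^{\text{ext}}$ depends only on the corresponding position $\mathbf{r}_\alpha$. 
(The force $\mathbf{F}_1^{\text{ext}}$, for instance, depends on the position $\mathbf{r}_1$, but not on $\mathbf{r}_2$, $\mathbf{r}_3$, $\mathbf{r}_4$.) 
Therefore, we can handle $\mathbf{F}_\alpha^{\text{ext}}$ exactly as we did the force on a single particle. In 
particular, if $\mathbf{F}_\alpha^{\text{ext}}$ is conservative, we can introduce a potential energy $U_\alpha^{\text{ext}}(\mathbf{r}_\alpha)$ and 
the corresponding force is given by 

$$\mathbf{F}_\alpha^{\text{ext}} = -\nabla_\alpha U_\alpha^{\text{ext}}(\mathbf{r}_\alpha) \tag{4.93}$$

where, of course, $\nabla_\alpha$ denotes differentiation with respect to the coordinates of 
particle $\alpha$. 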

We can now put all the potential energies together and define the total potential 
energy as the sum 

$$\begin{aligned} U = U^{\text{int}} + U^{\text{ext}} = {} & (U_{12} + U_{13} + U_{14} + U_{23} + U_{24} + U_{34}) \\ & + (U_1^{\text{ext}} + U_2^{\text{ext}} + U_3^{\text{ext}} + U_4^{\text{ext}}). \end{aligned} \tag{4.94}$$

In this definition, $U^{\text{int}}$ is the sum over all six pairs of particles of the pairwise 
potential energies, $U_{12}, \cdots, U_{34}$, and $U^{\text{ext}}$ is the sum of the four potential energies, 
$U_1^{\text{ext}}, \cdots, U_4^{\text{ext}}$, arising from the external forces. 

It is a fairly straightforward matter to show (see Problem 4.51 for more details) 
that the force on particle $\alpha$ is just (minus) the gradient of $U$ with respect to the 
coordinates $(x_\alpha, y_\alpha, z_\alpha)$. Consider, for instance, the gradient $-\nabla_1 U$. When $-\nabla_1$ acts 
on the first line of (4.94), its action on the first three terms, $U_{12} + U_{13} + U_{14}$, gives 
precisely the three internal forces, $\mathbf{F}_{12} + \mathbf{F}_{13} + \mathbf{F}_{14}$. Acting on the last three terms, 
$U_{23} + U_{24} + U_{34}$, it produces zero, since none of these depend on $\mathbf{r}_1$. When $-\nabla_1$ acts 
on the second line of (4.94), its action on the first term, $U_1^{\text{ext}}$, produces the external 
force $\mathbf{F}_1^{\text{ext}}$. Acting on the last three terms it produces zero, since none of them depend 
on $\mathbf{r}_1$. Accordingly, 

$$-\nabla_1 U = \mathbf{F}_{12} + \mathbf{F}_{13} + \mathbf{F}_{14} + \mathbf{F}_1^{\text{ext}} = (\text{net force on particle 1}). \tag{4.95}$$

14 This is quite a subtle point. I am, of course, not denying that the extra particles exert extra 
forces on particle 3. I claim only that the force of particle 4 on particle 3 is independent of the 
presence or absence of particles 1 and 2 and any external bodies. One could imagine a world where 
this claim was false (the presence of particle 1 could somehow change the force of 4 on 3), but 
experiment seems to confirm that in our world my claim is true. 

In exactly the same way, we can prove that in general 


$$-\nabla_\alpha U = (\text{net force on particle } \alpha) \tag{4.96}$$


as expected. 

The second crucial property of our definition of potential energy U is that (provided 
all the forces concerned are conservative, so we can define U), the total energy, defined 
as E = T + U, is conserved. We prove this in the now familiar way (for more details, 
see Problem 4.52): Apply the work-KE theorem to each of the four particles and add 
the results to show that, in any short time interval, $dT = W_{\text{tot}}$ where $W_{\text{tot}}$ denotes 
the total work done by all forces on all particles. Next show that $W_{\text{tot}} = -dU$, and 
conclude that $dT = -dU$, and hence 

$$dE = dT + dU = 0.$$


That is, energy is conserved. 


N Particles 

The extension of these ideas to an arbitrary number of particles is now quite straightforward, 
and I shall just write down the principal formulas. For N particles, labeled 
$\alpha = 1, \cdots, N$, the total kinetic energy is just the sum of the N separate kinetic energies 

$$T = \sum_\alpha T_\alpha = \sum_\alpha \tfrac12 m_\alpha v_\alpha^2.$$

Assuming that all forces are conservative, for each pair of particles, $\alpha\beta$, we introduce 
the potential energy $U_{\alpha\beta}$ that describes their interaction, and for each particle $\alpha$ we 
introduce the potential energy $U_\alpha^{\text{ext}}$ that describes the net external force on that particle. 
The total potential energy is then 

$$U = U^{\text{int}} + U^{\text{ext}} = \sum_\alpha\sum_{\beta>\alpha} U_{\alpha\beta} + \sum_\alpha U_\alpha^{\text{ext}}. \tag{4.97}$$

(Here the condition $\beta > \alpha$ in the double sum makes sure we don’t double count the 
internal interactions $U_{\alpha\beta}$. For instance, we include $U_{12}$ but not $U_{21}$.) 

With the potential energy $U$ defined in this way, the net force on any particle $\alpha$ is 
given by $-\nabla_\alpha U$, as in Equation (4.96), and total energy $E = T + U$ is conserved. 
Finally, if any forces are nonconservative, we can define $U$ as the potential energy 
pertaining to the conservative forces and then show that, in this case, $dE = W_{\text{nc}}$ where 
$W_{\text{nc}}$ is the work done by the nonconservative forces. 
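
The bookkeeping in the total potential energy, with each pair counted once, translates directly into code. The sketch below builds $U$ for four particles from a pairwise energy and a per-particle external energy, then obtains the net force on each particle as minus the numerical gradient of $U$ with respect to that particle's coordinates. The specific choices (a Coulomb-like pair energy $1/d$ and a uniform-gravity external energy with $mg = 0.98$) and all names are hypothetical and serve only as an illustration.

```python
import math
from itertools import combinations

def total_potential(positions, pair_potential, external_potential):
    """U = sum over pairs (each counted once) of U_ab plus sum over particles of U_a^ext."""
    u_int = sum(
        pair_potential(math.dist(positions[a], positions[b]))
        for a, b in combinations(range(len(positions)), 2)    # pairs with b > a only
    )
    u_ext = sum(external_potential(p) for p in positions)
    return u_int + u_ext

def net_force(alpha, positions, pair_potential, external_potential, h=1e-6):
    """-grad_alpha U by central differences: the net force on particle alpha."""
    force = []
    for axis in range(3):
        plus = [list(p) for p in positions]
        minus = [list(p) for p in positions]
        plus[alpha][axis] += h
        minus[alpha][axis] -= h
        dU = (total_potential(plus, pair_potential, external_potential)
              - total_potential(minus, pair_potential, external_potential))
        force.append(-dU / (2 * h))
    return force

def coulomb_like(d):
    return 1.0 / d                  # hypothetical pair energy, arbitrary units

def uniform_gravity(p):
    return 0.98 * p[2]              # hypothetical U_ext = m g z with m g = 0.98

positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
forces = [net_force(a, positions, coulomb_like, uniform_gravity) for a in range(4)]
total_force = [sum(f[i] for f in forces) for i in range(3)]
```

Because the internal forces cancel in action-reaction pairs, `total_force` is just the sum of the four external forces, here pointing straight down.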

Rigid Bodies 

While the formalism of the last two sections is fairly general and complicated, you can 
perhaps take some comfort that most applications of the formalism are much simpler 
than the formalism itself. As one simple example, consider a rigid body, such as a 
golf ball or a meteorite, made up of N atoms. The number N is typically very large, 
but the energy formalism just developed usually turns out to be very simple. As you 
probably recall from elementary physics, the total kinetic energy of the N particles 
rigidly bound together is just the kinetic energy of the center-of-mass motion plus 
the kinetic energy of rotation. (I’ll be proving this in Chapter 10, but I hope you’ll 
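accept it for now.) The potential energy of the internal, interatomic forces as given by 
(4.97) is 

$$U^{\text{int}} = \sum_\alpha\sum_{\beta>\alpha} U_{\alpha\beta}(\mathbf{r}_\alpha - \mathbf{r}_\beta). \tag{4.98}$$

If the interatomic forces are central (as is usually the case), then, as we saw in Section 
4.8, the potential energy $U_{\alpha\beta}$ actually depends on just the magnitude of $\mathbf{r}_\alpha - \mathbf{r}_\beta$ (not 
its direction). Thus we can rewrite (4.98) as 

$$U^{\text{int}} = \sum_\alpha\sum_{\beta>\alpha} U_{\alpha\beta}(|\mathbf{r}_\alpha - \mathbf{r}_\beta|). \tag{4.99}$$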

Now, as a rigid body moves, the positions $\mathbf{r}_\alpha$ of its constituent atoms can, of course, 
move, but the distance $|\mathbf{r}_\alpha - \mathbf{r}_\beta|$ between any two atoms cannot change. (This is, in 
fact, the definition of a rigid body.) Therefore, if the body concerned is truly rigid, none 
of the terms in (4.99) can change. That is, the potential energy $U^{\text{int}}$ of the internal forces 
is a constant and can, therefore, be ignored. Thus, in applying energy considerations 
to a rigid body we can entirely ignore $U^{\text{int}}$ and have to worry only about the energy 
$U^{\text{ext}}$ corresponding to the external forces. Since this latter energy is often a very simple 
function (see the following example), energy considerations as applied to a rigid body 
are usually very straightforward. 


Example 4.9 A Cylinder Rolling down an Incline 

A uniform rigid cylinder of radius $R$ rolls without slipping down a sloping track 
as shown in Figure 4.23. Use energy conservation to find its speed $v$ when it 
reaches a vertical height $h$ below its point of release. 

Figure 4.23 A uniform cylinder starts from rest and rolls 
without slipping down a slope through a total vertical 
drop $h = Y_{\text{in}} - Y_{\text{fin}}$ (with the CM coordinate $Y$ measured 
vertically up). 

In accordance with the preceding discussion we can ignore the internal forces 
that hold the cylinder together. The external forces on the cylinder are the normal 
and frictional forces of the track and gravity. The first two do no work, and 
gravity is conservative. As you certainly recall from introductory physics, the 
gravitational potential energy of an extended body is the same as if all the mass 
were concentrated at the center of mass. (See Problem 4.6.) Therefore, 

$$U^{\text{ext}} = MgY,$$

where $Y$ is the height of the cylinder’s CM measured up from any convenient 
reference level. The kinetic energy of the cylinder is $T = \tfrac12 Mv^2 + \tfrac12 I\omega^2$, where 
$I$ is its moment of inertia, $I = \tfrac12 MR^2$, and $\omega$ is its angular velocity of rolling, 
$\omega = v/R$. Thus the final kinetic energy is 

$$T = \tfrac34 Mv^2$$

and the initial KE is zero. Therefore, conservation of energy in the form $\Delta T = -\Delta U^{\text{ext}}$ 
implies that 

$$\tfrac34 Mv^2 = -Mg(Y_{\text{fin}} - Y_{\text{in}}) = Mgh$$

and hence that the final speed is 

$$v = \sqrt{\frac{4gh}{3}}.$$
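
For the uniform cylinder the energy balance $\tfrac34 Mv^2 = Mgh$ gives $v = \sqrt{4gh/3}$. The short sketch below evaluates this and, as an assumption-labeled generalization not made in the text, lets the moment of inertia be $I = c\,MR^2$ with an adjustable factor `inertia_factor` $= c$, so that $\tfrac12(1 + c)Mv^2 = Mgh$. The function name, the default $g = 9.8$, and the drop height are illustrative choices.

```python
import math

def rolling_speed(h, g=9.8, inertia_factor=0.5):
    """Speed after a vertical drop h for a body with I = inertia_factor * M * R^2
    rolling without slipping, from (1/2)(1 + inertia_factor) M v^2 = M g h."""
    return math.sqrt(2 * g * h / (1 + inertia_factor))

h = 1.0                                             # vertical drop, illustrative value
v_cylinder = rolling_speed(h)                       # uniform cylinder, I = (1/2) M R^2
v_sliding = rolling_speed(h, inertia_factor=0.0)    # no rotation, for comparison
```

The rolling cylinder arrives more slowly than a body that slides without rotating, because part of the released potential energy goes into rotational kinetic energy. Neither the mass nor the radius appears in the result.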



Principal Definitions and Equations of Chapter 4 

Work-KE Theorem 

The change in KE of a particle as it moves from point 1 to point 2 is 

$$\Delta T = T_2 - T_1 = \int_1^2 \mathbf{F}\cdot d\mathbf{r} = W(1 \to 2)$$ [Eq. (4.7)] 

where $T = \tfrac12 mv^2$ and $W(1 \to 2)$ is the work which is done by the total force $\mathbf{F}$ on the 
particle and is defined by the preceding integral. 

Conservative Forces and Potential Energy 

A force $\mathbf{F}$ on a particle is conservative if (i) it depends only on the particle’s position, 
$\mathbf{F} = \mathbf{F}(\mathbf{r})$, and (ii) for any two points 1 and 2, the work $W(1 \to 2)$ done by $\mathbf{F}$ is the 
same for all paths joining 1 and 2 (or equivalently, $\nabla\times\mathbf{F} = 0$). [Sections 4.2 & 4.4] 
If $\mathbf{F}$ is conservative, we can define a corresponding potential energy so that 

$$U(\mathbf{r}) = -W(\mathbf{r}_0 \to \mathbf{r}) = -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{F}(\mathbf{r}')\cdot d\mathbf{r}'$$ [Eq. (4.13)] 

and 

$$\mathbf{F} = -\nabla U.$$ [Eq. (4.33)] 

If all the forces on a particle are conservative with corresponding potential energies 
$U_1, \cdots, U_n$, then the total mechanical energy 

$$E = T + U_1 + \cdots + U_n$$ [Eq. (4.22)] 

is constant. More generally if there are also nonconservative forces, $\Delta E = W_{\text{nc}}$, the 
work done by the nonconservative forces. 


Central Forces 

A force F(r) is central if it is everywhere directed toward or away from a “force 
center.” If we take the latter to be the origin, 

$$\mathbf{F}(\mathbf{r}) = f(\mathbf{r})\,\hat{\mathbf{r}}.$$ [Eq. (4.65)] 

A central force is spherically symmetric [$f(\mathbf{r}) = f(r)$] if and only if it is conservative. 
[Sec. 4.8] 


Energy of a Multiparticle System 

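If all forces (internal and external) on a multiparticle system are conservative, the total 
potential energy, 

$$U = U^{\text{int}} + U^{\text{ext}} = \sum_\alpha\sum_{\beta>\alpha} U_{\alpha\beta} + \sum_\alpha U_\alpha^{\text{ext}},$$ [Eq. (4.97)] 

satisfies 

$$(\text{net force on particle } \alpha) = -\nabla_\alpha U$$ [Eq. (4.96)] 

and 

$$T + U = \text{constant}.$$ [Problem 4.52] 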

Problems for Chapter 4 

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult (***). 


Section 4.1 Kinetic Energy and Work 

4.1 * By writing $\mathbf{a}\cdot\mathbf{b}$ in terms of components prove that the product rule for differentiation applies to 
the dot product of two vectors; that is, 

$$\frac{d}{dt}(\mathbf{a}\cdot\mathbf{b}) = \frac{d\mathbf{a}}{dt}\cdot\mathbf{b} + \mathbf{a}\cdot\frac{d\mathbf{b}}{dt}.$$


4.2 ** Evaluate the work done 

$$W = \int_O^P \mathbf{F}\cdot d\mathbf{r} = \int_O^P (F_x\,dx + F_y\,dy) \tag{4.100}$$

by the two-dimensional force $\mathbf{F} = (x^2, 2xy)$ along the three paths joining the origin to the point 
P = (1, 1) as shown in Figure 4.24(a) and defined as follows: (a) This path goes along the x axis 
to Q = (1, 0) and then straight up to P. (Divide the integral into two pieces, $\int_O^P = \int_O^Q + \int_Q^P$.) (b) On 
this path $y = x^2$, and you can replace the term $dy$ in (4.100) by $dy = 2x\,dx$ and convert the whole 
integral into an integral over $x$. (c) This path is given parametrically as $x = t^3$, $y = t^2$. In this case 
rewrite $x$, $y$, $dx$, and $dy$ in (4.100) in terms of $t$ and $dt$, and convert the integral into an integral over $t$. 

4.3 ** Do the same as in Problem 4.2, but for the force $\mathbf{F} = (-y, x)$ and for the three paths joining P 
and Q shown in Figure 4.24(b) and defined as follows: (a) This path goes straight from P = (1, 0) to 
the origin and then straight to Q = (0, 1). (b) This is a straight line from P to Q. (Write y as a function 
of x and rewrite the integral as an integral over x.) (c) This is a quarter-circle centered on the origin. 
(Write x and y in polar coordinates and rewrite the integral as an integral over $\phi$.) 

Figure 4.24 (a) Problem 4.2. (b) Problem 4.3 

4.4 ** A particle of mass m is moving on a frictionless horizontal table and is attached to a massless 
string, whose other end passes through a hole in the table, where I am holding it. Initially the particle is 
moving in a circle of radius $r_0$ with angular velocity $\omega_0$, but I now pull the string down through the hole 
until a length r remains between the hole and the particle. (a) What is the particle’s angular velocity 
now? (b) Assuming that I pull the string so slowly that we can approximate the particle’s path by a 
circle of slowly shrinking radius, calculate the work I did pulling the string. (c) Compare your answer 
to part (b) with the particle’s gain in kinetic energy. 


section 4.2 Potential Energy and Conservative Forces 

4.5 * (a) Consider a mass m in a uniform gravitational field $\mathbf{g}$, so that the force on m is $m\mathbf{g}$, where $\mathbf{g}$ 
is a constant vector pointing vertically down. If the mass moves by an arbitrary path from point 1 to 
point 2, show that the work done by gravity is $W_{\rm grav}(1 \to 2) = -mgh$, where h is the vertical height 
gained between points 1 and 2. Use this result to prove that the force of gravity is conservative (at least 
in a region small enough so that $\mathbf{g}$ can be considered constant). (b) Show that, if we choose axes with 
y measured vertically up, the gravitational potential energy is U = mgy (if we choose U = 0 at the 
origin). 

4.6 * For a system of N particles subject to a uniform gravitational field $\mathbf{g}$ acting vertically down, prove 
that the total gravitational potential energy is the same as if all the mass were concentrated at the center 
of mass of the system; that is, 

$U = \sum_{\alpha} U_{\alpha} = MgY$ 

where $M = \sum_{\alpha} m_{\alpha}$ is the total mass and $\mathbf{R} = (X, Y, Z)$ is the position of the CM, with the y coordinate 
measured vertically up. [Hint: We know from Problem 4.5 that $U_{\alpha} = m_{\alpha} g y_{\alpha}$.] 

4.7 * Near to the point where I am standing on the surface of Planet X, the gravitational force on a 
mass m is vertically down but has magnitude $m\gamma y^2$, where $\gamma$ is a constant and y is the mass’s height 
above the horizontal ground. (a) Find the work done by gravity on a mass m moving from $\mathbf{r}_1$ to $\mathbf{r}_2$, and 
use your answer to show that gravity on Planet X, although most unusual, is still conservative. Find the 
corresponding potential energy. (b) Still on the same planet, I thread a bead on a curved, frictionless, 
rigid wire, which extends from ground level to a height h above the ground. Show clearly in a picture 
the forces on the bead when it is somewhere on the wire. (Just name the forces so it’s clear what they 
are; don’t worry about their magnitude.) Which of the forces are conservative and which are not? (c) If 
I release the bead from rest at a height h, how fast will it be going when it reaches the ground? 

4.8 ** Consider a small frictionless puck perched at the top of a fixed sphere of radius R. If the puck 
is given a tiny nudge so that it begins to slide down, through what vertical height will it descend before 
it leaves the surface of the sphere? [Hint: Use conservation of energy to find the puck’s speed as a 
function of its height, then use Newton’s second law to find the normal force of the sphere on the puck. 
At what value of this normal force does the puck leave the sphere?] 

4.9 ** (a) The force exerted by a one-dimensional spring, fixed at one end, is $F = -kx$, where x is the 
displacement of the other end from its equilibrium position. Assuming that this force is conservative 
(which it is), show that the corresponding potential energy is $U = \frac{1}{2}kx^2$, if we choose U to be zero at 
the equilibrium position. (b) Suppose that this spring is hung vertically from the ceiling with a mass m 
suspended from the other end and constrained to move in the vertical direction only. Find the extension 
$x_0$ of the new equilibrium position with the suspended mass. Show that the total potential energy (spring 
plus gravity) has the same form $\frac{1}{2}ky^2$ if we use the coordinate y equal to the displacement measured 
from the new equilibrium position at $x = x_0$ (and redefine our reference point so that U = 0 at y = 0). 

section 4.3 Force as the Gradient of Potential Energy 

4.10 * Find the partial derivatives with respect to x, y, and z of the following functions: (a) $f(x, y, z) = 
ax^2 + bxy + cy^2$, (b) $g(x, y, z) = \sin(axyz^2)$, (c) $h(x, y, z) = ae^{xy/z}$, where a, b, and c are constants. 
Remember that to evaluate $\partial f/\partial x$ you differentiate with respect to x treating y and z as constants. 

4.11 * Find the partial derivatives with respect to x, y, and z of the following functions: (a) $f(x, y, z) = 
ay^2 + 2byz + cz^2$, (b) $g(x, y, z) = \cos(axy^2z^3)$, (c) $h(x, y, z) = ar$, where a, b, and c are constants 
and $r = \sqrt{x^2 + y^2 + z^2}$. Remember that to evaluate $\partial f/\partial x$ you differentiate with respect to x treating 
y and z as constants. 

4.12 * Calculate the gradient $\nabla f$ of the following functions, f(x, y, z): (a) $f = x^2 + z^3$. (b) $f = ky$, 
where k is a constant. (c) $f = r = \sqrt{x^2 + y^2 + z^2}$. [Hint: Use the chain rule.] (d) $f = 1/r$. 

4.13 * Calculate the gradient $\nabla f$ of the following functions, f(x, y, z): (a) $f = \ln(r)$, (b) $f = r^n$, 
(c) $f = g(r)$, where $r = \sqrt{x^2 + y^2 + z^2}$ and g(r) is some unspecified function of r. [Hint: Use the 
chain rule.] 

4.14 * Prove that if $f(\mathbf{r})$ and $g(\mathbf{r})$ are any two scalar functions of $\mathbf{r}$, then 

$\nabla(fg) = f\nabla g + g\nabla f.$ 

4.15 * For $f(\mathbf{r}) = x^2 + 2y^2 + 3z^2$, use the approximation (4.35) to estimate the change in f if we 
move from the point $\mathbf{r} = (1, 1, 1)$ to (1.01, 1.03, 1.05). Compare with the exact result. 

4.16 * If a particle’s potential energy is $U(\mathbf{r}) = k(x^2 + y^2 + z^2)$, where k is a constant, what is the 
force on the particle? 

4.17 * A charge q in a uniform electric field $\mathbf{E}_0$ experiences a constant force $\mathbf{F} = q\mathbf{E}_0$. (a) Show that this 
force is conservative and verify that the potential energy of the charge at position $\mathbf{r}$ is $U(\mathbf{r}) = -q\mathbf{E}_0\cdot\mathbf{r}$. 
(b) By doing the necessary derivatives, check that $\mathbf{F} = -\nabla U$. 

4.18 ** Use the property (4.35) of the gradient to prove the following important results: (a) The vector 
$\nabla f$ at any point $\mathbf{r}$ is perpendicular to the surface of constant f through $\mathbf{r}$. (Choose a small displacement 
$d\mathbf{r}$ that lies in a surface of constant f. What is df for such a displacement?) (b) The direction of $\nabla f$ 
at any point $\mathbf{r}$ is the direction in which f increases fastest as we move away from $\mathbf{r}$. (Choose a small 
displacement $d\mathbf{r} = \epsilon\mathbf{u}$, where $\mathbf{u}$ is a unit vector and $\epsilon$ is fixed and small. Find the direction of $\mathbf{u}$ for 
which the corresponding df is maximum, bearing in mind that $\mathbf{a}\cdot\mathbf{b} = ab\cos\theta$.) 

4.19 ** (a) Describe the surfaces defined by the equation f = const, where $f = x^2 + 4y^2$. (b) Using 
the results of Problem 4.18, find a unit normal to the surface f = 5 at the point (1, 1, 1). In what direction 
should one move from this point to maximize the rate of change of f? 

section 4.4 The Second Condition that F be Conservative 

4.20 * Find the curl, $\nabla\times\mathbf{F}$, for the following forces: (a) $\mathbf{F} = k\mathbf{r}$; (b) $\mathbf{F} = (Ax, By^2, Cz^3)$; (c) $\mathbf{F} = 
(Ay^2, Bx, Cz)$, where A, B, C, and k are constants. 

4.21 * Verify that the gravitational force $-GMm\hat{\mathbf{r}}/r^2$ on a point mass m at $\mathbf{r}$, due to a fixed point mass 
M at the origin, is conservative and calculate the corresponding potential energy. 

4.22 * The proof in Example 4.5 (page 119) that the Coulomb force is conservative is considerably 
simplified if we evaluate $\nabla\times\mathbf{F}$ using spherical polar coordinates. Unfortunately, the expression for 
$\nabla\times\mathbf{F}$ in spherical polar coordinates is quite messy and hard to derive. However, the answer is given 
inside the back cover, and the proof can be found in any book on vector calculus or mathematical 
methods.$^{15}$ Taking the expression inside the back cover on faith, prove that the Coulomb force $\mathbf{F} = 
\gamma\hat{\mathbf{r}}/r^2$ is conservative. 

4.23 ** Which of the following forces is conservative? (a) $\mathbf{F} = k(x, 2y, 3z)$, where k is a constant. 
(b) $\mathbf{F} = k(y, x, 0)$. (c) $\mathbf{F} = k(-y, x, 0)$. For those which are conservative, find the corresponding 
potential energy U, and verify by direct differentiation that $\mathbf{F} = -\nabla U$. 

4.24 *** An infinitely long, uniform rod of mass $\mu$ per unit length is situated on the z axis. (a) Calculate 
the gravitational force $\mathbf{F}$ on a point mass m at a distance $\rho$ from the z axis. (The gravitational force 
between two point masses is given in Problem 4.21.) (b) Rewrite $\mathbf{F}$ in terms of the rectangular 
coordinates (x, y, z) of the point and verify that $\nabla\times\mathbf{F} = 0$. (c) Show that $\nabla\times\mathbf{F} = 0$ using the 
expression for $\nabla\times\mathbf{F}$ in cylindrical polar coordinates given inside the back cover. (d) Find the 
corresponding potential energy U. 

4.25 *** The proof that the condition $\nabla\times\mathbf{F} = 0$ guarantees the path independence of the work 
$\int_1^2 \mathbf{F}\cdot d\mathbf{r}$ done by $\mathbf{F}$ is unfortunately too lengthy to be included here. However, the following three 
exercises capture the main points:$^{16}$ (a) Show that the path independence of $\int_1^2 \mathbf{F}\cdot d\mathbf{r}$ is equivalent to 
the statement that the integral $\oint_\Gamma \mathbf{F}\cdot d\mathbf{r}$ around any closed path $\Gamma$ is zero. (By tradition, the symbol $\oint$ is 
used for integrals around a closed path, that is, a path that starts and stops at the same point.) [Hint: For any 
two points 1 and 2 and any two paths from 1 to 2, consider the work done by $\mathbf{F}$ going from 1 to 2 along 
the first path and then back to 1 along the second in the reverse direction.] (b) Stokes’s theorem asserts 
that $\oint_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,dA$, where the integral on the right is a surface integral over a surface for 
which the path $\Gamma$ is the boundary, and $\hat{\mathbf{n}}$ and dA are a unit normal to the surface and an element of area. 
Show that Stokes’s theorem implies that if $\nabla\times\mathbf{F} = 0$ everywhere, then $\oint_\Gamma \mathbf{F}\cdot d\mathbf{r} = 0$. (c) While the 
general proof of Stokes’s theorem is beyond our scope here, the following special case is quite easy to 
prove (and is an important step toward the general proof): Let $\Gamma$ denote a rectangular closed path lying 
in a plane perpendicular to the z direction and bounded by the lines x = B, x = B + b, y = C, and 
y = C + c. For this simple path (traced counterclockwise as seen from above), prove Stokes’s theorem 
that 

$\oint_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,dA$ 

where $\hat{\mathbf{n}} = \hat{\mathbf{z}}$ and the integral on the right runs over the flat, rectangular area inside $\Gamma$. [Hint: The 
integral on the left contains four terms, two of which are integrals over x and two over y. If you 
pair them in this way, you can combine each pair into a single integral with an integrand of the form 
$F_x(x, C + c, z) - F_x(x, C, z)$ (or a similar term with the roles of x and y exchanged). You can rewrite 
this integrand as an integral over y of $\partial F_x(x, y, z)/\partial y$ (and similarly with the other term), and you’re 
home.] 
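
The rectangular special case in part (c) can be checked numerically for a particular force. The sketch below is an illustration only, under assumptions not in the problem: the sample field $\mathbf{F} = (-y^2, xy, 0)$ and the rectangle dimensions are arbitrary choices, the z dependence is dropped because the path lies in a plane of constant z, and both integrals are approximated by the midpoint rule.

```python
def circulation(F, B, b, C, c, n=2000):
    """Counterclockwise line integral of F = (Fx, Fy) around the rectangle
    B <= x <= B + b, C <= y <= C + c, using the midpoint rule."""
    hx, hy = b / n, c / n
    total = 0.0
    for i in range(n):
        x = B + (i + 0.5) * hx
        y = C + (i + 0.5) * hy
        # Bottom side runs in +x, top side in -x: the pair gives Fx(x, C) - Fx(x, C + c).
        total += (F(x, C)[0] - F(x, C + c)[0]) * hx
        # Right side runs in +y, left side in -y: the pair gives Fy(B + b, y) - Fy(B, y).
        total += (F(B + b, y)[1] - F(B, y)[1]) * hy
    return total


def curl_flux(curl_z, B, b, C, c, n=400):
    """Surface integral of the z component of curl F over the same rectangle."""
    hx, hy = b / n, c / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += curl_z(B + (i + 0.5) * hx, C + (j + 0.5) * hy) * hx * hy
    return total


def F(x, y):
    """Sample field (an arbitrary illustrative choice): F = (-y^2, x*y)."""
    return -y**2, x * y


def curl_z(x, y):
    """z component of curl F for the sample field: dFy/dx - dFx/dy = y + 2y."""
    return 3.0 * y


line = circulation(F, B=1.0, b=2.0, C=0.5, c=1.0)
flux = curl_flux(curl_z, B=1.0, b=2.0, C=0.5, c=1.0)
print(line, flux)
```

The pairing of opposite sides inside `circulation` mirrors the hint: each pair is a difference of the same component at the two ends of the rectangle, which is what turns into an integral of a partial derivative.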

section 4.5 Time-Dependent Potential Energy 

4.26 * A mass m is in a uniform gravitational field, which exerts the usual force F = mg vertically 
down, but with g varying with time, g = g(t). Choosing axes with y measured vertically up and defining 
U = mgy as usual, show that $\mathbf{F} = -\nabla U$ as usual, but, by differentiating $E = \frac{1}{2}mv^2 + U$ with respect 
to t, show that E is not conserved. 

15 See, for example, Mathematical Methods in the Physical Sciences by Mary Boas (Wiley, 1983), p. 435. 

16 For a complete discussion see, for example, Mathematical Methods, Boas, Ch. 6, Sections 8-11. 

4.27 ** Suppose that the force $\mathbf{F}(\mathbf{r}, t)$ depends on the time t but still satisfies $\nabla\times\mathbf{F} = 0$. It is a 
mathematical fact (related to Stokes’s theorem as discussed in Problem 4.25) that the work integral 
$\int_1^2 \mathbf{F}(\mathbf{r}, t)\cdot d\mathbf{r}$ (evaluated at any one time t) is independent of the path taken between the points 1 
and 2. Use this to show that the time-dependent PE defined by (4.48), for any fixed time t, has the 
claimed property that $\mathbf{F}(\mathbf{r}, t) = -\nabla U(\mathbf{r}, t)$. Can you see what goes wrong with the argument leading 
to Equation (4.19), that is, conservation of energy? 

section 4.6 Energy for Linear One-Dimensional Systems 

4.28 ** Consider a mass m on the end of a spring of force constant k and constrained to move along 
the horizontal x axis. If we place the origin at the spring’s equilibrium position, the potential energy 
is $\frac{1}{2}kx^2$. At time t = 0 the mass is sitting at the origin and is given a sudden kick to the right so that 
it moves out to a maximum displacement at $x_{\max} = A$ and then continues to oscillate about the origin. 
(a) Write down the equation for conservation of energy and solve it to give the mass’s velocity $\dot{x}$ in 
terms of the position x and the total energy E. (b) Show that $E = \frac{1}{2}kA^2$, and use this to eliminate E 
from your expression for $\dot{x}$. Use the result (4.58), $t = \int dx'/\dot{x}(x')$, to find the time for the mass to move 
from the origin out to a position x. (c) Solve the result of part (b) to give x as a function of t and show 
that the mass executes simple harmonic motion with period $2\pi\sqrt{m/k}$. 

4.29 ** [Computer] A mass m confined to the x axis has potential energy $U = kx^4$ with k > 0. 
(a) Sketch this potential energy and qualitatively describe the motion if the mass is initially stationary 
at x = 0 and is given a sharp kick to the right at t = 0. (b) Use (4.58) to find the time for the mass to 
reach its maximum displacement $x_{\max} = A$. Give your answer as an integral over x in terms of m, A, 
and k. Hence find the period $\tau$ of oscillations of amplitude A as an integral. (c) By making a suitable 
change of variables in the integral, show that the period $\tau$ is inversely proportional to the amplitude A. 
(d) The integral of part (b) cannot be evaluated in terms of elementary functions, but it can be done 
numerically. Find the period for the case that m = k = A = 1. 
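
One possible way to do the numerical work in part (d) is sketched below, as an illustration rather than the intended solution. It assumes energy conservation in the form $\frac{1}{2}m\dot{x}^2 + kx^4 = kA^4$, so that the period is $\tau = 4\int_0^A dx/\sqrt{(2k/m)(A^4 - x^4)}$. The integrand blows up at x = A, so the sketch substitutes $x = A\sin\theta$ (a choice made here, not given in the problem), which leaves the smooth integral $\int_0^{\pi/2} d\theta/\sqrt{1 + \sin^2\theta}$ for Simpson's rule.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0


def quartic_period(m=1.0, k=1.0, A=1.0):
    """Period of oscillation of amplitude A in the potential U = k x^4.

    After x = A sin(theta) the period integral becomes
    tau = 4 * sqrt(m / (2k)) * (1 / A) * integral_0^{pi/2} dtheta / sqrt(1 + sin^2 theta).
    """
    integral = simpson(lambda th: 1.0 / math.sqrt(1.0 + math.sin(th) ** 2),
                       0.0, math.pi / 2.0)
    return 4.0 * math.sqrt(m / (2.0 * k)) * integral / A


print(quartic_period())        # the case m = k = A = 1
print(quartic_period(A=2.0))   # doubling A should halve the period
```

The explicit factor 1/A in the last line of `quartic_period` is the scaling asked for in part (c); the numerical integral itself does not depend on A.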

section 4.7 Curvilinear One-Dimensional Systems 

4.30 * Figure 4.25 shows a child’s toy, which has the shape of a cylinder mounted on top of a 
hemisphere. The radius of the hemisphere is R and the CM of the whole toy is at a height h above 
the floor. (a) Write down the gravitational potential energy when the toy is tipped to an angle $\theta$ from 
the vertical. [You need to find the height of the CM as a function of $\theta$. It helps to think first about the 
height of the hemisphere’s center O as the toy tilts.] (b) For what values of R and h is the equilibrium 
at $\theta = 0$ stable? 

4.31 * (a) Write down the total energy E of the two masses in the Atwood machine of Figure 4.15 in 
terms of the coordinate x and $\dot{x}$. (b) Show (what is true for any conservative one-dimensional system) 
that you can obtain the equation of motion for the coordinate x by differentiating the equation E = 
const. Check that the equation of motion is the same as you would obtain by applying Newton’s second 
law to each mass and eliminating the unknown tension from the two resulting equations. 

4.32 ** Consider the bead of Figure 4.13 threaded on a curved rigid wire. The bead’s position is 
specified by its distance s, measured along the wire from the origin. (a) Prove that the bead’s speed v is 
just $|\dot{s}|$. (Write $\mathbf{v}$ in terms of its components, dx/dt, etc., and find its magnitude using Pythagoras’s 
theorem.) (b) Prove that $m\ddot{s} = F_{\rm tang}$, the tangential component of the net force on the bead. (One way 
to do this is to take the time derivative of the equation $v^2 = \mathbf{v}\cdot\mathbf{v}$. The left side should lead you to $\ddot{s}$ and 
the right to $F_{\rm tang}$.) (c) One force on the bead is the normal force $\mathbf{N}$ of the wire (which constrains the 
bead to stay on the wire). If we assume that all other forces (gravity, etc.) are conservative, then their 
resultant can be derived from a potential energy U. Prove that $F_{\rm tang} = -dU/ds$. This shows that 
one-dimensional systems of this type can be treated just like linear systems, with x replaced by s and $F_x$ 
by $F_{\rm tang}$. 

4.33 ** [Computer] (a) Verify the expression (4.59) for the potential energy of the cube balanced on 
a cylinder in Example 4.7 (page 130). (b) Make plots of $U(\theta)$ for b = 0.9r and b = 1.1r. (You may as 
well choose units such that r, m, and g are all equal to 1.) (c) Use your plots to confirm the findings 
of Example 4.7 concerning the stability of the equilibrium at $\theta = 0$. Are there any other equilibrium 
points and are they stable? 

4.34 ** An interesting one-dimensional system is the simple pendulum, consisting of a point mass m, 
fixed to the end of a massless rod (length l), whose other end is pivoted from the ceiling to let it swing 
freely in a vertical plane, as shown in Figure 4.26. The pendulum’s position can be specified by its angle 
$\phi$ from the equilibrium position. (It could equally be specified by its distance s from equilibrium, 
indeed $s = l\phi$, but the angle is a little more convenient.) (a) Prove that the pendulum’s potential energy 
(measured from the equilibrium level) is 

$U(\phi) = mgl(1 - \cos\phi)$. (4.101) 

Write down the total energy E as a function of $\phi$ and $\dot{\phi}$. (b) Show that by differentiating your expression 
for E with respect to t you can get the equation of motion for $\phi$ and that the equation of motion is just 
the familiar $\Gamma = I\alpha$ (where $\Gamma$ is the torque, I is the moment of inertia, and $\alpha$ is the angular acceleration 
$\ddot{\phi}$). (c) Assuming that the angle $\phi$ remains small throughout the motion, solve for $\phi(t)$ and show that 
the motion is periodic with period 

$\tau_0 = 2\pi\sqrt{l/g}$. (4.102) 

(The subscript “o” is to emphasize that this is the period for small oscillations.) 

Figure 4.27 Problem 4.36 

4.35 ** Consider the Atwood machine of Figure 4.15, but suppose that the pulley has radius R and 
moment of inertia I. (a) Write down the total energy of the two masses and the pulley in terms of the 
coordinate x and $\dot{x}$. (Remember that the kinetic energy of a spinning wheel is $\frac{1}{2}I\omega^2$.) (b) Show (what 
is true for any conservative one-dimensional system) that you can obtain the equation of motion for the 
coordinate x by differentiating the equation E = const. Check that the equation of motion is the same 
as you would obtain by applying Newton’s second law separately to the two masses and the pulley, and 
then eliminating the two unknown tensions from the three resulting equations. 

4.36 ** A metal ball (mass m) with a hole through it is threaded on a frictionless vertical rod. A 
massless string (length l) attached to the ball runs over a massless, frictionless pulley and supports a 
block of mass M, as shown in Figure 4.27. The positions of the two masses can be specified by the 
one angle $\theta$. (a) Write down the potential energy $U(\theta)$. (The PE is given easily in terms of the heights 
shown as h and H. Eliminate these two variables in favor of $\theta$ and the constants b and l. Assume that 
the pulley and ball have negligible size.) (b) By differentiating $U(\theta)$ find whether the system has an 
equilibrium position, and for what values of m and M equilibrium can occur. Discuss the stability of 
any equilibrium positions. 

4.37 [Computer] Figure 4.28 shows a massless wheel of radius R, mounted on a frictionless, 
horizontal axle. A point mass M is glued to the edge of the wheel, and a mass m hangs from a string 
wrapped around the perimeter of the wheel. (a) Write down the total PE of the two masses as a function 
of the angle $\phi$. (b) Use this to find the values of m and M for which there are any positions of equilibrium. 
Describe the equilibrium positions, discuss their stability, and explain your answers in terms of torques. 
(c) Plot $U(\phi)$ for the cases that m = 0.7M and m = 0.8M, and use your graphs to describe the behavior 
of the system if I release it from rest at $\phi = 0$. (d) Find the critical value of m/M on one side of which 
the system oscillates and on the other side of which it does not (if released from rest at $\phi = 0$). 

Figure 4.28 Problem 4.37 

4.38 *** [Computer] Consider the simple pendulum of Problem 4.34. You can get an expression for 
the pendulum’s period (good for large oscillations as well as small) using the method discussed in 
connection with (4.57), as follows: (a) Using (4.101) for the PE, find $\dot{\phi}$ as a function of $\phi$. Next use 
(4.57), in the form $t = \int d\phi/\dot{\phi}$, to write the time for the pendulum to travel from $\phi = 0$ to its maximum 
value (the amplitude) $\Phi$. Because this time is a quarter of the period $\tau$, you can now write down the 
period. Show that 

$\tau = \tau_0\,\frac{1}{\pi}\int_0^{\Phi} \frac{d\phi}{\sqrt{\sin^2(\Phi/2) - \sin^2(\phi/2)}} = \tau_0\,\frac{2}{\pi}\int_0^1 \frac{du}{\sqrt{1 - u^2}\,\sqrt{1 - A^2u^2}},$ (4.103) 

where $\tau_0$ is the period (4.102) (Problem 4.34) for small oscillations and $A = \sin(\Phi/2)$. [To get the 
first expression you will need to use the trig identity for $1 - \cos\phi$ in terms of $\sin^2(\phi/2)$. To get the 
second you need to make the substitution $\sin(\phi/2) = Au$.] These integrals cannot be evaluated in 
terms of elementary functions. However, the second integral is a standard integral called the complete 
elliptic integral of the first kind, sometimes denoted $K(A^2)$, whose values are tabulated$^{17}$ and are 
known to computer software such as Mathematica [which calls it EllipticK($A^2$)]. (b) If you have access 
to computer software that knows this function, make a plot of $\tau/\tau_0$ for amplitudes $0 \le \Phi \le 3$ rad. 
Comment. What becomes of $\tau$ as the amplitude of oscillation approaches $\pi$? Explain. 
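
If no software with a built-in elliptic integral is at hand, the ratio $\tau/\tau_0$ in (4.103) can still be computed with a few lines of code. The sketch below relies on a standard identity that is not part of this problem: the complete elliptic integral of the first kind with modulus A equals $\pi/2$ divided by the arithmetic-geometric mean of 1 and $\sqrt{1 - A^2}$. With $A = \sin(\Phi/2)$ this gives $\tau/\tau_0 = 1/\mathrm{AGM}(1, \cos(\Phi/2))$. The function names are illustrative.

```python
import math


def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(100):              # converges in a handful of steps
        if abs(a - b) <= 1e-15 * a:
            break
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a


def period_ratio(amplitude):
    """tau / tau_0 for a pendulum of amplitude Phi (radians, 0 <= Phi < pi)."""
    return 1.0 / agm(1.0, math.cos(amplitude / 2.0))


# Tabulate the ratio over the range asked for in part (b).
for amplitude in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"Phi = {amplitude:.1f} rad   tau/tau_0 = {period_ratio(amplitude):.4f}")
```

The same numbers could be fed to any plotting tool. Note that the function is deliberately restricted to amplitudes below $\pi$, where $\cos(\Phi/2)$ is still positive.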

4.39 *** (a) If you have not already done so, do Problem 4.38(a). (b) If the amplitude $\Phi$ is small 
then so is $A = \sin(\Phi/2)$. If the amplitude is very small, we can simply ignore the last square root in 
(4.103). Show that this gives the familiar result for the small-amplitude period, $\tau = \tau_0 = 2\pi\sqrt{l/g}$. 
(c) If the amplitude is small but not very small, we can improve on the approximation of part (b). Use 
the binomial expansion to give the approximation $1/\sqrt{1 - A^2u^2} \approx 1 + \frac{1}{2}A^2u^2$ and show that, in this 
approximation, (4.103) gives 

$\tau = \tau_0\left[1 + \frac{1}{4}\sin^2(\Phi/2)\right]$. 

What percentage correction does the second term represent for an amplitude of 45°? (The exact answer 
for $\Phi = 45°$ is $1.040\,\tau_0$ to four significant figures.) 

section 4.8 Central Forces 

4.40 * (a) Verify the three equations (4.68) that give x, y, z in terms of the spherical polar coordinates 
$r, \theta, \phi$. (b) Find expressions for $r, \theta, \phi$ in terms of x, y, z. 


17 See, for example, M. Abramowitz and I. Stegun, Handbook of Mathematical Functions, Dover, New York, 
1965. Be warned that different authors use different notations. In particular, some authors call the exact same 
integral K(A). 

Figure 4.29 Problem 4.44 


4.41 * A mass m moves in a circular orbit (centered on the origin) in the field of an attractive central 
force with potential energy $U = kr^n$. Prove the virial theorem that $T = nU/2$. 

4.42 * In one dimension, it is obvious that a force obeying Hooke’s law is conservative (since F = —kx 
depends only on the position x, and this is sufficient to guarantee that F is conservative in one 
dimension). Consider instead a spring that obeys Hooke’s law and has one end fixed at the origin, 
but whose other end is free to move in all three dimensions. (The spring could be fastened to a point in 
the ceiling and be supporting a bouncing mass m at its other end, for instance.) Write down the force 
$\mathbf{F}(\mathbf{r})$ exerted by the spring in terms of its length r and its equilibrium length $r_0$. Prove that this force is 
conservative. [Hints: Is the force central? Assume that the spring does not bend.] 

4.43 ** In Section 4.8, I claimed that a force $\mathbf{F}(\mathbf{r})$ that is central and spherically symmetric is automatically 
conservative. Here are two ways to prove it: (a) Since $\mathbf{F}(\mathbf{r})$ is central and spherically symmetric, it 
must have the form $\mathbf{F}(\mathbf{r}) = f(r)\,\hat{\mathbf{r}}$. Using Cartesian coordinates, show that this implies that $\nabla\times\mathbf{F} = 0$. 
(b) Even quicker, using the expression given inside the back cover for $\nabla\times\mathbf{F}$ in spherical polars, show 
that $\nabla\times\mathbf{F} = 0$. 

4.44 ** Problem 4.43 suggests two proofs that a central, spherically symmetric force is automatically 
conservative, but neither proof makes really clear why this is so. Here is a proof that is less complete but 
more insightful: Consider any two points A and B and two different paths ACB and ADB connecting 
them as shown in Figure 4.29. Path ACB goes radially out from A until it reaches the radius $r_B$ of B, 
and then around a sphere (center O) to B. Path ADB goes around a sphere of radius $r_A$ until it reaches 
the line OB, and then radially out to B. Explain clearly why the work done by a central, spherically 
symmetric force $\mathbf{F}$ is the same along both paths. (This doesn’t prove that the work is the same along 
any two paths from A to B. If you want you can complete the proof by showing that any path can be 
approximated by a series of paths moving radially in or out and paths of constant r.) 

4.45 ** In Section 4.8, I proved that a force $\mathbf{F}(\mathbf{r}) = f(\mathbf{r})\,\hat{\mathbf{r}}$ that is central and conservative is automatically 
spherically symmetric. Here is an alternative proof: Consider the two paths ACB and ADB of 
Figure 4.29, but with $r_B = r_A + dr$ where dr is infinitesimal. Write down the work done by $\mathbf{F}(\mathbf{r})$ going 
around both paths, and use the fact that they must be equal to prove that the magnitude function $f(\mathbf{r})$ 
must be the same at points A and D; that is, $f(\mathbf{r}) = f(r)$ and the force is spherically symmetric. 

section 4.9 Energy of Interaction of Two Particles 

4.46 * Consider an elastic collision of two particles as in Example 4.8 (page 143), but with unequal 
masses, $m_1 \ne m_2$. Show that the angle $\theta$ between the two outgoing velocities satisfies $\theta < \pi/2$ if 
$m_1 > m_2$, but $\theta > \pi/2$ if $m_1 < m_2$. 

4.47 * Consider a head-on elastic collision between two particles. (Since the collision is head-on, the 
motion is confined to a single straight line and is therefore one-dimensional.) Prove that the relative 
velocity after the collision is equal and opposite to that before. That is, $v_1 - v_2 = -(v_1' - v_2')$, where 
$v_1$ and $v_2$ are the initial velocities and $v_1'$ and $v_2'$ the corresponding final velocities. 

4.48 * A particle of mass $m_1$ and speed $v_1$ collides with a second particle of mass $m_2$ at rest. If the 
collision is perfectly inelastic (the two particles lock together and move off as one), what fraction of the 
kinetic energy is lost in the collision? Comment on your answer for the cases that $m_1 \ll m_2$ and that 
$m_2 \ll m_1$. 

4.49 ** Both the Coulomb and gravitational forces lead to potential energies of the form $U = \gamma/|\mathbf{r}_1 - 
\mathbf{r}_2|$, where $\gamma$ denotes $kq_1q_2$ in the case of the Coulomb force and $-Gm_1m_2$ for gravity, and $\mathbf{r}_1$ and $\mathbf{r}_2$ 
are the positions of the two particles. Show in detail that $-\nabla_1 U$ is the force on particle 1 and $-\nabla_2 U$ 
that on particle 2. 

4.50 ** The formalism of the potential energy of two particles depends on the claim in (4.81) that 

$\nabla_1 U(\mathbf{r}_1 - \mathbf{r}_2) = -\nabla_2 U(\mathbf{r}_1 - \mathbf{r}_2)$. 

Prove this. (Use the chain rule for differentiation. The proof in three dimensions is notationally 
awkward, so prove the one-dimensional result that 

$\frac{\partial}{\partial x_1} f(x_1 - x_2) = -\frac{\partial}{\partial x_2} f(x_1 - x_2)$ 

and then convince yourself that it extends to three dimensions.) 

section 4.10 The Energy of a Multiparticle System 

4.51 ** Write out the arguments of all the potential energies of the four-particle system in (4.94). 
For instance $U = U(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_4)$, whereas $U_{34} = U_{34}(\mathbf{r}_3 - \mathbf{r}_4)$. Show in detail that the net force on 
particle 3 (for instance) is given by $-\nabla_3 U$. [You know that the separate forces, internal and external, 
are given by (4.92) and (4.93).] 

4.52 ** Consider the four-particle system of Section 4.10. (a) Write down the work-KE theorem for 
each of the four particles separately and, by adding these four equations, show that the change in the 
total KE in a short time interval dt is $dT = W_{\rm tot}$, where $W_{\rm tot}$ is the total work done on all particles by 
all forces. [This shouldn’t take more than two or three lines.] (b) Next show that $W_{\rm tot} = -dU$, where 
dU is the change in total PE during the same time interval. Deduce that the total mechanical energy 
E = T + U is conserved. 

4.53 ** (a) Consider an electron (charge $-e$ and mass m) in a circular orbit of radius r around a fixed 
proton (charge $+e$). Remembering that the inward Coulomb force $ke^2/r^2$ is what gives the electron its 
centripetal acceleration, prove that the electron’s KE is equal to $-\frac{1}{2}$ times its PE; that is, $T = -\frac{1}{2}U$ and 
hence $E = \frac{1}{2}U$. (This result is a consequence of the so-called virial theorem. See Problem 4.41.) Now 
consider the following inelastic collision of an electron with a hydrogen atom: Electron number 1 is in 
a circular orbit of radius r around a fixed proton. (This is the hydrogen atom.) Electron 2 approaches 
from afar with kinetic energy $T_2$. When the second electron hits the atom, the first electron is knocked 
free, and the second is captured in a circular orbit of radius $r'$. (b) Write down an expression for the 
total energy of the three-particle system in general. (Your answer should contain five terms, three PEs 
but only two KEs, since the proton is considered fixed.) (c) Identify the values of all five terms and the 
total energy E long before the collision occurs, and again long after it is all over. What is the KE of the 
outgoing electron 1 once it is far away? Give your answers in terms of the variables $T_2$, r, and $r'$. 



CHAPTER 5 


Oscillations 


Almost any system that is displaced from a position of stable equilibrium exhibits 
oscillations. If the displacement is small, the oscillations are almost always of the type 
called simple harmonic. Oscillations, and particularly simple harmonic oscillations, 
are therefore extremely widespread. They are also extremely useful. For example, 
all good clocks depend on an oscillator to regulate their time keeping: The first 
reliable clocks used a pendulum; the first accurate watches (historically crucial in 
navigation) used an oscillating balance wheel; modern watches use the oscillations 
of a quartz crystal; and today’s most accurate clocks, such as the atomic clock at 
the National Institute of Standards and Technology in Boulder, Colorado, use the 
oscillations of an atom. In this chapter, we shall explore the physics and mathematics 
of oscillations. I shall begin with simple harmonic oscillations and then go on to 
damped oscillations (oscillations that die out because of resistive forces) and driven 
oscillations (oscillations that are maintained by an outside driving force, as in all 
clocks). The last three sections of this chapter describe the use of Fourier series in 
finding the motion of an oscillator driven by an arbitrary periodic driving force. 


5.1 Hooke’s Law 


As you are certainly aware, a mass on the end of a spring that obeys Hooke’s law 
executes oscillations of the type that we call simple harmonic. Before we review the 
proof of this claim, let us first ask why Hooke’s law is so important and appears so 
frequently. Hooke’s law asserts that the force exerted by a spring has the form (for 
now we’ll restrict ourselves to a spring confined to the x axis) 

$F_x(x) = -kx$ (5.1) 

where x is the displacement of the spring from its equilibrium length and k is a positive 
number called the force constant. That k is positive means that the equilibrium at x = 0 
is stable: When x = 0 there is no force, when x > 0 (displacement to the right) the 
force is negative (back to the left), and when x < 0 (displacement to the left) the 
force is positive (back to the right); either way, the force is a restoring force, and the 
equilibrium is stable. (If k were negative, the force would be away from the origin, and 
the equilibrium would be unstable, in which case we do not expect to see oscillations.) 
An exactly equivalent way to state Hooke’s law is that the potential energy is 


U(x) = \kx 2 . 


Consider now an arbitrary conservative one-dimensional system which is specified 
by a coordinate x and has potential energy U(x). Suppose that the system has a stable 
equilibrium position $x = x_0$, which we may as well take to be the origin ($x_0 = 0$). 
Now consider the behavior of U(x) in the vicinity of the equilibrium position. Since 
any reasonable function can be expanded in a Taylor series, we can safely write 

$$U(x) = U(0) + U'(0)\,x + \tfrac{1}{2} U''(0)\,x^2 + \cdots. \tag{5.2}$$

As long as x remains small, the first three terms in this series should be a good 
approximation. The first term is a constant, and, since we can always subtract a 
constant from U(x) without affecting any physics, we may as well redefine U(0) 
to be zero. Because x = 0 is an equilibrium point, $U'(0) = 0$ and the second term 
in the series (5.2) is automatically zero. Because the equilibrium is stable, $U''(0)$ is 
positive. Renaming $U''(0)$ as k, we conclude that for small displacements it is always 
a good approximation to take¹ 

$$U(x) = \tfrac{1}{2} k x^2. \tag{5.3}$$

That is, for sufficiently small displacements from stable equilibrium, Hooke’s law is 
always valid. Notice that if $U''(0)$ were negative, then k would also be negative, and the 
equilibrium would be unstable, a case we’re not interested in just now. Hooke’s law 
in the form (5.3) crops up in many situations, although it is certainly not necessary that 
the coordinate be the rectangular coordinate x, as the following example illustrates. 
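
As a rough numerical illustration of (5.3), the following sketch compares a sample potential with its quadratic approximation. The potential U(x) = 1 − cos x (for which U''(0) = 1, so k = 1) and all function names are assumptions chosen only for illustration; they do not come from the text.

```python
import math


def potential(x):
    """Sample potential (an assumption for illustration): U(x) = 1 - cos(x)."""
    return 1.0 - math.cos(x)


def hooke_approximation(x, k=1.0):
    """Quadratic approximation U(x) ~ (1/2) k x^2, with k = U''(0) = 1 here."""
    return 0.5 * k * x**2


def relative_error(x):
    """Fractional difference between the approximation and the sample potential."""
    exact = potential(x)
    return abs(hooke_approximation(x) - exact) / exact


if __name__ == "__main__":
    # The error should shrink rapidly as the displacement gets smaller.
    for x in (0.01, 0.1, 0.5, 1.0):
        print(f"x = {x:4.2f}  relative error = {relative_error(x):.2e}")
```

For this particular potential the relative error is roughly x²/12 at small x, so the approximation improves quickly as the displacement shrinks.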


Example 5.1 The Cube Balanced on a Cylinder 

Consider again the cube of Example 4.7 (page 130) and show that for small 
angles θ the potential energy takes the Hooke’s-law form $U(\theta) = \tfrac{1}{2} k \theta^2$. 

We saw in that example that 

$$U(\theta) = mg[(r + b)\cos\theta + r\theta\sin\theta].$$

If θ is small we can make the approximations $\cos\theta \approx 1 - \theta^2/2$ and $\sin\theta \approx \theta$, 
so that 

$$U(\theta) \approx mg\left[(r + b)\left(1 - \tfrac{1}{2}\theta^2\right) + r\theta^2\right] = mg(r + b) + \tfrac{1}{2} mg(r - b)\theta^2,$$

which, apart from the uninteresting constant, has the form $\tfrac{1}{2} k \theta^2$ with “spring 
constant” k = mg(r − b). Notice that the equilibrium is stable (k positive) only 
when r > b, a condition we had already found in Example 4.7. 


¹ The only exception is if $U''(0)$ happens to be zero, but I shall not worry about this exceptional 
case here. 



Figure 5.1 A mass m with potential energy $U(x) = \tfrac{1}{2} k x^2$ and total 
energy E oscillates between the two turning points at x = ±A, where 
U(x) = E and the kinetic energy is zero. 


As discussed in Section 4.6, the general features of the motion of any one-dimensional 
system can be understood from a graph of U(x) against x. For the 
Hooke’s-law potential energy (5.3), this graph is a parabola, as shown in Figure 
5.1. If a mass m has potential energy of this form and has any total energy E > 0, 
it is trapped and oscillates between the two turning points where U(x) = E, so that 
the kinetic energy is zero and the mass is instantaneously at rest. Because U (x) is 
symmetric about x = 0, the two turning points are equidistant on opposite sides of 
the origin and are traditionally denoted x = ±A, where A is called the amplitude of 
the oscillations. 


5.2 Simple Harmonic Motion 


We are now ready to examine the equation of motion (that is, Newton’s second 
law) for a mass m that is displaced from a position of stable equilibrium. To be 
definite, let us consider a cart on a frictionless track attached to a fixed spring as 
sketched in Figure 5.2. We have seen that we can approximate the potential energy 
by (5.3) or, equivalently, the force by $F_x(x) = -kx$. Thus the equation of motion is 
$m\ddot{x} = F_x = -kx$ or 

$$\ddot{x} = -\frac{k}{m}\,x = -\omega^2 x \tag{5.4}$$

where I have introduced the constant 

$$\omega = \sqrt{\frac{k}{m}},$$

which we shall see is the angular frequency with which the cart will oscillate. Although 
we have arrived at Equation (5.4) in the context of a cart on a spring moving along 
the x axis, we shall see eventually that it applies to many different oscillating systems 
in many different coordinate systems. For example, we have already seen in Equation 
(1.55) that the angle φ that gives the position of a pendulum (or a skateboard in a 
half-pipe) is governed by the same equation, $\ddot{\phi} = -\omega^2\phi$, at least for small values 
of φ. In this section I am going to review the properties of the solutions of (5.4). 
Unfortunately, there are many different ways to write the same solution, all of which 
have their advantages, and you should be comfortable with them all. 


Figure 5.2 A cart of mass m oscillating on the end of a spring. 


The Exponential Solutions 

Equation (5.4) is a second-order, linear, homogeneous differential equation² and so 
has two independent solutions. These two independent solutions can be chosen in 
several different ways, but perhaps the most convenient is this: 

$$x(t) = e^{i\omega t} \quad\text{and}\quad x(t) = e^{-i\omega t}.$$

As you can easily check, both of these functions do satisfy (5.4). Further, any constant 
multiple of either solution is also a solution, and likewise any sum of such multiples. 
Thus the function 

$$x(t) = C_1 e^{i\omega t} + C_2 e^{-i\omega t} \tag{5.5}$$

is also a solution for any two constants $C_1$ and $C_2$. (That any linear combination 
of solutions like this is itself a solution is called the superposition principle and 
plays a crucial role in many branches of physics.) Since this solution (5.5) contains 
two arbitrary constants, it is the general solution of our second-order equation (5.4).³ 
Therefore, any solution can be expressed in the form (5.5) by suitable choice of the 
coefficients $C_1$ and $C_2$. 


² Linear because it contains no higher powers of x or its derivatives than the first power, and 
homogeneous because every term is a first power (that is, there is no term independent of x and its 
derivatives). 

³ Recall the result, discussed below Equation (1.56), that the general solution of a second-order 
differential equation contains precisely two arbitrary constants. 


The Sine and Cosine Solutions 

The exponential functions in (5.5) are so convenient to handle that (5.5) is often the 
best form of the solution. Nevertheless, this form does have one disadvantage. We 
know, of course, that x(t) is real, whereas the two exponentials in (5.5) are complex. 
This means the coefficients $C_1$ and $C_2$ must be chosen carefully to ensure that x(t) 
itself is real. I shall return to this point shortly, but first I shall rewrite (5.5) in another 
useful way. From Euler’s formula (2.76) we know that the two exponentials in (5.5) 
can be written as 

$$e^{\pm i\omega t} = \cos(\omega t) \pm i\sin(\omega t).$$

Substituting into (5.5) and regrouping we find that 

$$\begin{aligned}
x(t) &= (C_1 + C_2)\cos(\omega t) + i(C_1 - C_2)\sin(\omega t) \\
     &= B_1\cos(\omega t) + B_2\sin(\omega t)
\end{aligned} \tag{5.6}$$

where $B_1$ and $B_2$ are simply new names for the coefficients in the previous line, 

$$B_1 = C_1 + C_2 \quad\text{and}\quad B_2 = i(C_1 - C_2). \tag{5.7}$$

The form (5.6) can be taken as the definition of simple harmonic motion (or SHM): 
Any motion that is a combination of a sine and cosine of this form is called simple 
harmonic. Because the functions cos(ωt) and sin(ωt) are real, the requirement that 
x(t) be real means simply that the coefficients $B_1$ and $B_2$ must be real. 

We can easily identify the coefficients $B_1$ and $B_2$ in terms of the initial conditions 
of the problem. Clearly at t = 0, (5.6) implies that $x(0) = B_1$. That is, $B_1$ is just the 
initial position $x(0) = x_0$. Similarly, by differentiating (5.6), we can identify $\omega B_2$ as 
the initial velocity $v_0$. 
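
The identification of the coefficients with the initial conditions can be sketched in a few lines of Python. This is only an illustration under the assumption that ω is nonzero; the function names are hypothetical and do not come from the text.

```python
import math


def shm_position(t, x0, v0, omega):
    """x(t) = B1 cos(omega t) + B2 sin(omega t) with B1 = x0 and B2 = v0 / omega."""
    b1 = x0
    b2 = v0 / omega
    return b1 * math.cos(omega * t) + b2 * math.sin(omega * t)


def shm_velocity(t, x0, v0, omega):
    """Time derivative of shm_position, obtained by differentiating (5.6)."""
    return -omega * x0 * math.sin(omega * t) + v0 * math.cos(omega * t)
```

Evaluating both functions at t = 0 returns the chosen x0 and v0, which is one quick way to confirm that the coefficients have been matched correctly.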

If I start the oscillations by pulling the cart aside to $x = x_0$ and releasing it from 
rest ($v_0 = 0$), then $B_2 = 0$ in (5.6) and only the cosine term survives, so that 

$$x(t) = x_0\cos(\omega t). \tag{5.8}$$

If I launch the cart from the origin ($x_0 = 0$) by giving it a kick at t = 0, only the sine 
term survives, and 

$$x(t) = \frac{v_0}{\omega}\sin(\omega t).$$

These two simple cases are illustrated in Figure 5.3. Notice that both solutions, like 
the general solution (5.6), are periodic because both the sine and cosine are. Since the 
argument of both sine and cosine is ωt, the function x(t) repeats itself after the time 
τ for which ωτ = 2π. That is, the period is 

$$\tau = \frac{2\pi}{\omega} = 2\pi\sqrt{\frac{m}{k}}. \tag{5.9}$$


Figure 5.3 (a) Oscillations in which the cart is released from $x_0$ at t = 0 follow 
a cosine curve. (b) If the cart is kicked from the origin at t = 0, the oscillations 
follow a sine curve with initial slope $v_0$. In either case the period of the oscillations 
is $\tau = 2\pi/\omega = 2\pi\sqrt{m/k}$ and is the same whatever the values of $x_0$ or $v_0$. 


The Phase-Shifted Cosine Solution 

The general solution (5.6) is harder to visualize than the two special cases of Figure 
5.3, and it can usefully be rewritten as follows: First, we define yet another 
constant 

$$A = \sqrt{B_1^2 + B_2^2}. \tag{5.10}$$

Notice that A is the hypotenuse of a right triangle whose other two sides are $B_1$ and 
$B_2$. I have indicated this in Figure 5.4, where I have also defined δ as the lower angle 
of that triangle. We can now rewrite (5.6) as 

$$\begin{aligned}
x(t) &= A\left[\frac{B_1}{A}\cos(\omega t) + \frac{B_2}{A}\sin(\omega t)\right] \\
     &= A[\cos\delta\cos(\omega t) + \sin\delta\sin(\omega t)] \\
     &= A\cos(\omega t - \delta).
\end{aligned} \tag{5.11}$$

From this form it is clear that the cart is oscillating with amplitude A, but instead of 
being a simple cosine as in (5.8), it is a cosine which is shifted in phase: When t = 0 
the argument of the cosine is −δ, and the oscillations lag behind the simple cosine by 
the phase shift δ. We have derived the result (5.11) from Newton’s second law, but, as 
so often happens, one can derive the same result in more than one way. In particular, 
(5.11) can also be derived using the energy approach discussed in Section 4.6. (See 
Problem 4.28.) 


Figure 5.4 The constants A and δ are defined in terms of $B_1$ and 
$B_2$ as shown. 
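
The conversion from the sine-cosine form (5.6) to the phase-shifted form (5.11) can be sketched as below. The function name is hypothetical, and the use of `math.atan2` to pick δ is an assumption made here so that cos δ = B₁/A and sin δ = B₂/A hold for either sign of the coefficients.

```python
import math


def amplitude_and_phase(b1, b2):
    """Return (A, delta) with A = sqrt(b1^2 + b2^2), cos(delta) = b1/A, sin(delta) = b2/A."""
    return math.hypot(b1, b2), math.atan2(b2, b1)
```

With A and δ in hand, A cos(ωt − δ) should reproduce B₁ cos(ωt) + B₂ sin(ωt) at every time t.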


Solution as the Real Part of a Complex Exponential 

There is still another useful way to write our solution, in terms of the complex 
exponentials of (5.5). The coefficients $C_1$ and $C_2$ there are related to the coefficients 
$B_1$ and $B_2$ of the sine-cosine form by Equation (5.7), which we can solve to give 

$$C_1 = \tfrac{1}{2}(B_1 - iB_2) \quad\text{and}\quad C_2 = \tfrac{1}{2}(B_1 + iB_2). \tag{5.12}$$

Since $B_1$ and $B_2$ are real, this shows that both $C_1$ and $C_2$ are generally complex and 
that $C_2$ is the complex conjugate of $C_1$, 

$$C_2 = C_1^*.$$

(Recall that for any complex number z = x + iy, the complex conjugate z* is defined 
as⁴ z* = x − iy.) Thus the solution (5.5) can be written as 

$$x(t) = C_1 e^{i\omega t} + C_1^* e^{-i\omega t} \tag{5.13}$$

where the whole second term on the right is just the complex conjugate of the first. 
(See Problem 5.35 if this isn’t clear to you.) Now, for any complex number z = x + iy, 

$$z + z^* = (x + iy) + (x - iy) = 2x = 2\,\mathrm{Re}\,z$$

where Re z denotes the real part of z (namely x). Thus (5.13) can be written as 

$$x(t) = 2\,\mathrm{Re}\,C_1 e^{i\omega t}.$$

If we define a final constant $C = 2C_1$, we see from Equation (5.12) and Figure 5.4 
that 

$$C = B_1 - iB_2 = A e^{-i\delta} \tag{5.14}$$

and 

$$x(t) = \mathrm{Re}\,C e^{i\omega t} = \mathrm{Re}\,A e^{i(\omega t - \delta)}.$$

This beautiful result is illustrated in Figure 5.5. The complex number $A e^{i(\omega t - \delta)}$ moves 
counterclockwise with angular velocity ω around a circle of radius A. Its real part 
[namely x(t)] is the projection of the complex number onto the real axis. While the 
complex number goes around the circle, this projection oscillates back and forth on the 
x axis, with angular frequency ω and amplitude A. Specifically, x(t) = A cos(ωt − δ), 
in agreement with (5.11). 
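
The real-part form lends itself to a short numerical check using Python's built-in complex numbers. The sketch below is illustrative only; the function name is an assumption, and C is built directly from B₁ and B₂ as in (5.14).

```python
import cmath


def position_from_complex(t, b1, b2, omega):
    """x(t) = Re[C exp(i omega t)] with C = B1 - i B2, as in Equation (5.14)."""
    c = complex(b1, -b2)
    return (c * cmath.exp(1j * omega * t)).real
```

The value returned should agree with B₁ cos(ωt) + B₂ sin(ωt) from (5.6), since both are the same solution written in two ways.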


⁴ While most physicists use the notation z*, mathematicians almost always use $\bar{z}$ for the complex 
conjugate of z. 


Figure 5.5 The position x(t) of the oscillating cart is the real 
part of the complex number $A e^{i(\omega t - \delta)}$. As the latter moves around 
the circle of radius A, the former oscillates back and forth on the 
x axis with amplitude A. 


Example 5.2 A Bottle in a Bucket 

A bottle is floating upright in a large bucket of water as shown in Figure 5.6. In 
equilibrium it is submerged to a depth $d_0$ below the surface of the water. Show 
that if it is pushed down to a depth d and released, it will execute harmonic 
motion, and find the frequency of its oscillations. If $d_0 = 20$ cm, what is the 
period of the oscillations? 

The two forces on the bottle are its weight mg downward and the upward 
buoyant force ϱgAd, where ϱ is the density of water and A is the cross-sectional 
area of the bottle. (Remember that Archimedes’ principle says that the buoyant 
force is ϱg times the volume submerged, which is just Ad.) The equilibrium 
depth $d_0$ is determined by the condition 

$$mg = \varrho g A d_0. \tag{5.15}$$


Figure 5.6 The bottle shown has been loaded with sand so that 
it floats upright in a bucket of water. Its equilibrium depth is 
$d = d_0$. 

Suppose now the bottle is at a depth $d = d_0 + x$. (This defines x as the distance 
from equilibrium, always the best coordinate to use.) Newton’s second law now 
reads 

$$m\ddot{x} = mg - \varrho g A (d_0 + x).$$

By (5.15), the first and second terms on the right cancel, and we’re left with 
$\ddot{x} = -\varrho g A x/m$. But again by (5.15), $\varrho g A/m = g/d_0$, so the equation of motion 
becomes 

$$\ddot{x} = -\frac{g}{d_0}\,x,$$

which is exactly the equation for simple harmonic motion. We conclude that 
the bottle moves up and down in SHM with angular frequency $\omega = \sqrt{g/d_0}$. A 
remarkable feature of this result is that the frequency of oscillations is independent 
of m, ϱ, and A; also, the frequency is the same as that of a simple pendulum 
of length $l = d_0$. If $d_0 = 20$ cm, then the period is 

$$\tau = \frac{2\pi}{\omega} = 2\pi\sqrt{\frac{d_0}{g}} = 2\pi\sqrt{\frac{0.20\ \text{m}}{9.8\ \text{m/s}^2}} = 0.9\ \text{sec}.$$


Try this experiment yourself! But be aware that the details of the flow of water 
around the bottle complicate the situation considerably. The calculation here is 
a very simplified version of the truth. 
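
The arithmetic of this example is easy to reproduce. The sketch below simply evaluates τ = 2π√(d₀/g); the function name and the default value g = 9.8 m/s² are assumptions matching the numbers used in the example.

```python
import math


def bottle_period(d0, g=9.8):
    """Period in seconds of the floating bottle, tau = 2*pi*sqrt(d0/g); d0 in m, g in m/s^2."""
    return 2.0 * math.pi * math.sqrt(d0 / g)


if __name__ == "__main__":
    print(f"{bottle_period(0.20):.2f} s")  # prints "0.90 s"
```

Doubling the equilibrium depth lengthens the period by a factor of √2, just as for a pendulum whose length is doubled.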


Energy Considerations 

To conclude this section on simple harmonic motion, let us consider briefly the energy 
of the oscillator (the cart on a spring or whatever it is) as it oscillates back and forth. 
Since x(t) = A cos(ωt − δ), the potential energy is just 

$$U = \tfrac{1}{2} k x^2 = \tfrac{1}{2} k A^2 \cos^2(\omega t - \delta).$$

Differentiating x(t) to give the velocity, we find for the kinetic energy 

$$\begin{aligned}
T = \tfrac{1}{2} m \dot{x}^2 &= \tfrac{1}{2} m \omega^2 A^2 \sin^2(\omega t - \delta) \\
  &= \tfrac{1}{2} k A^2 \sin^2(\omega t - \delta)
\end{aligned}$$

where the second line results from replacing $\omega^2$ by k/m. We see that both U and T 
oscillate between 0 and $\tfrac{1}{2} k A^2$, with their oscillations perfectly out of step: when U 
is maximum T is zero and vice versa. In particular, since $\sin^2\theta + \cos^2\theta = 1$, the total 
energy is constant, 

$$E = T + U = \tfrac{1}{2} k A^2, \tag{5.16}$$

as it has to be for any conservative force. 


5.3 Two-Dimensional Oscillators 


In two or three dimensions, the possibilities for oscillation are considerably richer 
than in one dimension. The simplest possibility is the so-called isotropic harmonic 
oscillator, for which the restoring force is proportional to the displacement from 
equilibrium, with the same constant of proportionality in all directions: 

$$\mathbf{F} = -k\mathbf{r}. \tag{5.17}$$

That is, $F_x = -kx$, $F_y = -ky$ (and $F_z = -kz$ in three dimensions), all with the same 
constant k. This force is a central force directed toward the equilibrium position, which 
we may as well take to be the origin, as sketched in Figure 5.7(a). Figure 5.7(b) shows 
an arrangement of four identical springs that would produce a force of this form; it is 
easy to see that if the mass at the center is moved away from its equilibrium position it 
will experience a net inward force, and it is not too hard to show (Problem 5.19) that 
this inward force has the form (5.17) for small displacements r.⁵ Another example of 
a two-dimensional isotropic oscillator is (at least approximately) a ball bearing rolling 
near the bottom of a large spherical bowl. Two important three-dimensional examples 
are an atom vibrating near its equilibrium in a symmetric crystal, and a proton (or 
neutron) as it moves inside a nucleus. 

Let us consider a particle that is subject to this type of force and suppose, for 
simplicity, that it is confined to two dimensions. The equation of motion, $\ddot{\mathbf{r}} = \mathbf{F}/m$, 
separates into two independent equations: 

$$\ddot{x} = -\omega^2 x, \qquad \ddot{y} = -\omega^2 y, \tag{5.18}$$

where I have introduced the familiar angular frequency $\omega = \sqrt{k/m}$ (which is the same 
in both x and y equations because the same is true of the force constants). Each of 
these two equations has exactly the form of the one-dimensional equation discussed 
in the last section, and the solutions are [as in (5.11)] 

$$x(t) = A_x\cos(\omega t - \delta_x), \qquad y(t) = A_y\cos(\omega t - \delta_y), \tag{5.19}$$

where the four constants $A_x$, $A_y$, $\delta_x$, and $\delta_y$ are determined by the initial conditions 
of the problem. By redefining the origin of time, we can dispose of the phase shift $\delta_x$, 
but, in general, we cannot also dispose of the corresponding phase in the y solution. 
Thus the simplest form for the general solution is 

$$x(t) = A_x\cos(\omega t), \qquad y(t) = A_y\cos(\omega t - \delta), \tag{5.20}$$

where $\delta = \delta_y - \delta_x$ is the relative phase of the y and x oscillations. (See Problem 5.15.) 


Figure 5.7 (a) A restoring force that is proportional to r defines 
the isotropic harmonic oscillator. (b) The mass at the center of this 
arrangement of springs would experience a net force of the form 
$\mathbf{F} = -k\mathbf{r}$ as it moves in the plane of the four springs. 


⁵ It is perhaps worth pointing out that one does not get a force of the form (5.17) by simply 
attaching a mass to a spring whose other end is anchored to the origin. 

The behavior of the solution (5.20) depends on the values of the three constants 
$A_x$, $A_y$, and δ. If either $A_x$ or $A_y$ is zero, the particle executes simple harmonic motion 
along one of the axes. (The ball bearing in the bowl rolls back and forth through the 
origin, moving in the x direction only or the y direction only.) If neither $A_x$ nor $A_y$ 
is zero, the motion depends critically on the relative phase δ. If δ = 0, then x(t) and 
y(t) rise and fall in step, and the point (x, y) moves back and forth on the slanting 
line that joins $(A_x, A_y)$ to $(-A_x, -A_y)$, as shown in Figure 5.8(a). If δ = π/2, then 
x and y oscillate out of step, with x at an extreme when y is zero, and vice versa; the 
point (x, y) describes an ellipse with semimajor and semiminor axes $A_x$ and $A_y$, as in 
Figure 5.8(b). For other values of δ, the point (x, y) moves around a slanting ellipse, 
as shown for the case δ = π/4 in Figure 5.8(c). (For a proof that the path really is an 
ellipse, see Problem 8.11.) 
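
The two special cases δ = 0 and δ = π/2 can be checked numerically with a short sketch like the one below. The function names and the default ω = 1 are assumptions for illustration; only the formula (5.20) is taken from the text.

```python
import math


def isotropic_position(t, a_x, a_y, delta, omega=1.0):
    """Position (x, y) of the isotropic oscillator from Equation (5.20)."""
    return a_x * math.cos(omega * t), a_y * math.cos(omega * t - delta)


def ellipse_residual(t, a_x, a_y, omega=1.0):
    """For delta = pi/2, return (x/A_x)^2 + (y/A_y)^2 - 1, which should vanish."""
    x, y = isotropic_position(t, a_x, a_y, math.pi / 2, omega)
    return (x / a_x) ** 2 + (y / a_y) ** 2 - 1.0
```

For δ = 0 the same position function gives points satisfying y = (A_y/A_x) x, the slanting line of Figure 5.8(a).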

Figure 5.8 Motion of a two-dimensional isotropic oscillator as given by (5.20). (a) If 
δ = 0, then x and y execute simple harmonic motion in step, and the point (x, y) moves 
back and forth along a slanting line as shown. (b) If δ = π/2, then (x, y) moves around 
an ellipse with axes along the x and y axes. (c) In general (for example, δ = π/4), the 
point (x, y) moves around a slanted ellipse as shown. 

In the anisotropic oscillator, the components of the restoring force are proportional 
to the components of the displacement, but with different constants of proportionality: 

$$F_x = -k_x x, \quad F_y = -k_y y, \quad\text{and}\quad F_z = -k_z z. \tag{5.21}$$

An example of such a force is the force felt by an atom displaced from its equilibrium 
position in a crystal of low symmetry, where it experiences different force constants 
along the different axes. For simplicity, I shall again consider a particle in two 
dimensions, for which Newton’s second law separates into two separate equations 
just as in (5.18): 


$$\ddot{x} = -\omega_x^2 x, \qquad \ddot{y} = -\omega_y^2 y. \tag{5.22}$$

The only difference between this and (5.18) is that there are now different frequencies 
for the different axes, $\omega_x = \sqrt{k_x/m}$ and so on. The solution of these two equations is 
just like (5.20): 

$$x(t) = A_x\cos(\omega_x t), \qquad y(t) = A_y\cos(\omega_y t - \delta). \tag{5.23}$$

Because of the two different frequencies, there is a much richer variety of possible 
motions. If $\omega_x/\omega_y$ is a rational number, it is fairly easy to see (Problem 5.17) that the 
motion is periodic, and the resulting path is called a Lissajous figure (after the French 
physicist Jules Lissajous, 1822-1880). For example, Figure 5.9(a) shows an orbit of 
a particle for which $\omega_x/\omega_y = 2$ and the x motion repeats itself twice as often as the 
y motion. In the case shown, the result is a figure eight. If $\omega_x/\omega_y$ is irrational, the 
motion is more complicated and never repeats itself. This case is illustrated, with 
$\omega_x/\omega_y = \sqrt{2}$, in Figure 5.9(b). This kind of motion is called quasiperiodic: The 
motion of the separate coordinates x and y is periodic, but because the two periods 
are incompatible, the motion of $\mathbf{r} = (x, y)$ is not. 
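
The difference between the rational and irrational cases can be illustrated numerically. The sketch below samples the path (5.23) and tests whether it returns to itself after one candidate period; the function names, unit amplitudes, and sampling grid are assumptions for illustration. A failed test for a single candidate period is only suggestive, of course, and is not a proof that no period exists.

```python
import math


def anisotropic_position(t, omega_x, omega_y, a_x=1.0, a_y=1.0, delta=0.0):
    """Position (x, y) of the anisotropic oscillator from Equation (5.23)."""
    return (a_x * math.cos(omega_x * t),
            a_y * math.cos(omega_y * t - delta))


def repeats_after(period, omega_x, omega_y, samples=200, tol=1e-9):
    """Return True if the sampled path satisfies r(t + period) = r(t) within tol."""
    for n in range(samples):
        t = 0.05 * n
        x0, y0 = anisotropic_position(t, omega_x, omega_y)
        x1, y1 = anisotropic_position(t + period, omega_x, omega_y)
        if abs(x1 - x0) > tol or abs(y1 - y0) > tol:
            return False
    return True
```

With ω_x = 2 and ω_y = 1 the path repeats after a time 2π, the period of the slower y motion, whereas with ω_x = √2 the same test fails.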


Figure 5.9 (a) One possible path for an anisotropic oscillator with $\omega_x = 2$ 
and $\omega_y = 1$. You can see that x goes back and forth twice in the time that y 
does so once, and the motion then repeats itself exactly. (b) A path for the 
case $\omega_x = \sqrt{2}$ and $\omega_y = 1$ from t = 0 to t = 24. In this case the path never 
repeats itself, although, if we wait long enough, it will come arbitrarily close 
to any point in the rectangle bounded by $x = \pm A_x$ and $y = \pm A_y$. 


5.4 Damped Oscillations 


I shall now return to the one-dimensional oscillator, and take up the possibility that 
there are resistive forces that will damp the oscillations. There are several possibilities 
for the resistive force. Ordinary sliding friction is approximately constant in magnitude, 
but always directed opposite to the velocity. The resistance offered by a fluid, 
such as air or water, depends on the velocity in a complicated way. However, as we 
saw in Chapter 2, it is sometimes a reasonable approximation to assume that the resistive 
force is proportional to v or (under different circumstances) to $v^2$. Here I shall 
assume that the resistive force is proportional to v; specifically, $\mathbf{f} = -b\mathbf{v}$. One of my 
main reasons is that this case leads to an especially simple equation to solve, and the 
equation is itself a very important equation that appears in several other contexts and 
is therefore well worth studying.⁶ 

Consider, then, an object in one dimension, such as a cart attached to a spring, that 
is subject to a Hooke’s law force, −kx, and a resistive force, $-b\dot{x}$. The net force on 
the object is $-b\dot{x} - kx$, and Newton’s second law reads (if I move the two force terms 
over to the left side) 

$$m\ddot{x} + b\dot{x} + kx = 0. \tag{5.24}$$

One of the beautiful things about physics is the way the same mathematical 
equation can arise in totally different physical contexts, so that our understanding of 
the equation in one situation carries over immediately to the other. Before we set about 
solving Equation (5.24), I would like to show how the same equation appears in the 
study of LRC circuits. An LRC circuit is a circuit containing an inductor (inductance 
L), a capacitor (capacitance C), and a resistor (resistance R), as sketched in Figure 
5.10. I have chosen the positive direction for the current to be counterclockwise, and 
the charge q(t) to be the charge on the left-hand plate of the capacitor [with −q(t) on 
the right], so that $I(t) = \dot{q}(t)$. If we follow around the circuit in the positive direction, 
the electric potential drops by $L\dot{I} = L\ddot{q}$ across the inductor, by $RI = R\dot{q}$ across the 
resistor, and by q/C across the capacitor. Applying Kirchhoff’s second rule for circuits, 
we conclude that 

$$L\ddot{q} + R\dot{q} + \frac{1}{C}\,q = 0. \tag{5.25}$$

This has exactly the form of Equation (5.24) for the damped oscillator, and anything 
that we learn about the equation for the oscillator will be immediately applicable to 
the LRC circuit. Notice that the inductance L of the electric circuit plays the role of 
the mass of the oscillator, the resistance term $R\dot{q}$ corresponds to the resistive force, 
and 1/C to the spring constant k. 


⁶ You should be aware, however, that although the case I am considering (that the resistive 
force f is linear in v) is very important, it is nevertheless a very special case. I shall describe some 
of the startling complications that can occur when f is not linear in v in Chapter 12 on nonlinear 
mechanics and chaos. 


Figure 5.10 An LRC circuit. 


Let us now return to mechanics and the differential equation (5.24). To solve this 
equation it is convenient to divide by m and then introduce two other constants. I shall 
rename the constant b/m as 2β, 

$$\frac{b}{m} = 2\beta. \tag{5.26}$$

This parameter β, which can be called the damping constant, is simply a convenient 
way to characterize the strength of the damping force: as with b, large β corresponds 
to a large damping force and conversely. I shall rename the constant k/m as $\omega_0^2$, 
that is, 

$$\omega_0 = \sqrt{\frac{k}{m}}. \tag{5.27}$$

Notice that $\omega_0$ is precisely what I was calling ω in the previous two sections. I have 
added the subscript because, once we admit resistive forces, various other frequencies 
become important. From now on, I shall use the notation $\omega_0$ to denote the system’s 
natural frequency, the frequency at which it would oscillate if there were no resistive 
force present, as given by (5.27). With these notations, the equation of motion (5.24) 
for the damped oscillator becomes 

$$\ddot{x} + 2\beta\dot{x} + \omega_0^2 x = 0. \tag{5.28}$$

Notice that both of the parameters β and $\omega_0$ have the dimensions of inverse time, that 
is, frequency. 
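
The two constants are straightforward to compute, and the correspondence with the LRC circuit means the same helper serves both systems. The sketch below is illustrative only: the function names are hypothetical, and the circuit version simply applies the substitutions m → L, b → R, k → 1/C read off by comparing (5.24) with (5.25).

```python
import math


def damping_parameters(m, b, k):
    """Return (beta, omega0) from Equations (5.26) and (5.27)."""
    beta = b / (2.0 * m)
    omega0 = math.sqrt(k / m)
    return beta, omega0


def lrc_parameters(L, R, C):
    """Same constants for the LRC circuit (5.25), using m -> L, b -> R, k -> 1/C."""
    return damping_parameters(L, R, 1.0 / C)
```

For the circuit this gives β = R/(2L) and ω₀ = 1/√(LC), both with dimensions of inverse time.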

Equation (5.28) is another second-order, linear, homogeneous equation [the last 
was (5.4)]. Therefore, if by any means we can spot two independent⁷ solutions, $x_1(t)$ 
and $x_2(t)$ say, then any solution must have the form $C_1 x_1(t) + C_2 x_2(t)$. What this 
means is that we are free to play a game of inspired guessing to find ourselves two 
independent solutions; if by hook or by crook we can spot two solutions, then we have 
the general solution. 


⁷ It is about time I gave you a definition of “independent.” In general this is a little complicated, 
but for two functions it is easy: Two functions are independent if neither is a constant multiple of 
the other. Thus the two functions sin(x) and cos(x) are independent; likewise the two functions x 
and $x^2$; but the two functions x and 3x are not. 

In particular, there is nothing to stop us trying to find a solution of the form 

$$x(t) = e^{rt} \tag{5.29}$$

for which 

$$\dot{x} = r e^{rt} \quad\text{and}\quad \ddot{x} = r^2 e^{rt}.$$

Substituting into (5.28) we see that our guess (5.29) satisfies (5.28) if and only if 

$$r^2 + 2\beta r + \omega_0^2 = 0 \tag{5.30}$$

[an equation sometimes called the auxiliary equation for the differential equation 
(5.28)]. The solutions of this equation are, of course, $r = -\beta \pm \sqrt{\beta^2 - \omega_0^2}$. Thus if 
we define the two constants 

$$r_1 = -\beta + \sqrt{\beta^2 - \omega_0^2}, \qquad r_2 = -\beta - \sqrt{\beta^2 - \omega_0^2}, \tag{5.31}$$

then the two functions $e^{r_1 t}$ and $e^{r_2 t}$ are two independent solutions of (5.28) and the 
general solution is 

$$x(t) = C_1 e^{r_1 t} + C_2 e^{r_2 t} \tag{5.32}$$

$$\phantom{x(t)} = e^{-\beta t}\left(C_1 e^{\sqrt{\beta^2 - \omega_0^2}\,t} + C_2 e^{-\sqrt{\beta^2 - \omega_0^2}\,t}\right). \tag{5.33}$$

This solution is rather too messy to be especially illuminating, but, by examining 
various ranges of the damping constant β, we can begin to see what (5.33) entails. 
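
Before looking at the separate cases, it may help to see the roots (5.31) computed directly. The sketch below uses Python's `cmath` module so that the square root is handled whether β² − ω₀² is positive, zero, or negative; the function name is an assumption for illustration.

```python
import cmath


def auxiliary_roots(beta, omega0):
    """Roots r1, r2 of r^2 + 2*beta*r + omega0^2 = 0, as in Equation (5.31)."""
    root = cmath.sqrt(beta**2 - omega0**2)  # purely imaginary when beta < omega0
    return -beta + root, -beta - root
```

The roots come out complex for β < ω₀, coincident for β = ω₀, and real and distinct for β > ω₀, which is exactly the division into cases taken up next.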


Undamped Oscillation 

If there is no damping then the damping constant β is zero, the square root in the 
exponents of (5.33) is just $i\omega_0$, and our solution reduces to 

$$x(t) = C_1 e^{i\omega_0 t} + C_2 e^{-i\omega_0 t}, \tag{5.34}$$

the familiar solution for the undamped harmonic oscillator. 

Weak Damping 

Suppose next that the damping constant β is small. Specifically, suppose that 

$$\beta < \omega_0, \tag{5.35}$$

a condition sometimes called underdamping. In this case, the square root in the 
exponents of (5.33) is again imaginary, and we can write 

$$\sqrt{\beta^2 - \omega_0^2} = i\sqrt{\omega_0^2 - \beta^2} = i\omega_1,$$

where 

$$\omega_1 = \sqrt{\omega_0^2 - \beta^2}. \tag{5.36}$$

The parameter $\omega_1$ is a frequency, which is less than the natural frequency $\omega_0$. In the 
important case of very weak damping ($\beta \ll \omega_0$), $\omega_1$ is very close to $\omega_0$. With this 
notation, the solution (5.33) becomes 

$$x(t) = e^{-\beta t}\left(C_1 e^{i\omega_1 t} + C_2 e^{-i\omega_1 t}\right). \tag{5.37}$$

This solution is the product of two factors: The first, $e^{-\beta t}$, is a decaying exponential, 
which steadily decreases toward zero. The second factor has exactly the form 
(5.34) of undamped oscillations, except that the natural frequency $\omega_0$ is replaced by 
the somewhat lower frequency $\omega_1$. We can rewrite the second factor, as in Equation 
(5.11), in the form $A\cos(\omega_1 t - \delta)$ and our solution becomes 

$$x(t) = A e^{-\beta t}\cos(\omega_1 t - \delta). \tag{5.38}$$

This solution clearly describes simple harmonic motion of frequency $\omega_1$ with an 
exponentially decreasing amplitude $A e^{-\beta t}$, as shown in Figure 5.11. The result (5.38) 
suggests another interpretation of the damping constant β. Since β has the dimensions 
of inverse time, 1/β is a time, and we now see that it is the time in which the 
amplitude function $A e^{-\beta t}$ falls to 1/e of its initial value. Thus, at least for underdamped 
oscillations, β can be seen as the decay parameter, a measure of the rate at which the 
motion dies out, 

$$(\text{decay parameter}) = \beta \quad [\text{underdamped motion}].$$

The larger β, the more rapidly the oscillations die out, at least for the case $\beta < \omega_0$ that 
we are discussing here. 


Figure 5.11 Underdamped oscillations can be thought of as simple 
harmonic oscillations with an exponentially decreasing amplitude 
$A e^{-\beta t}$. The dashed curves are the envelopes, $\pm A e^{-\beta t}$. 
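
As a check on (5.38), the sketch below evaluates the underdamped solution together with its first two derivatives (worked out by hand with the product rule) and substitutes them into the left side of (5.28). The function names and the default values A = 1, δ = 0 are assumptions for illustration.

```python
import math


def underdamped_state(t, beta, omega0, amplitude=1.0, delta=0.0):
    """Return (x, v, a) for x(t) = A exp(-beta t) cos(omega1 t - delta), beta < omega0."""
    omega1 = math.sqrt(omega0**2 - beta**2)  # Equation (5.36)
    envelope = amplitude * math.exp(-beta * t)
    c = math.cos(omega1 * t - delta)
    s = math.sin(omega1 * t - delta)
    x = envelope * c
    v = envelope * (-beta * c - omega1 * s)
    a = envelope * ((beta**2 - omega1**2) * c + 2.0 * beta * omega1 * s)
    return x, v, a


def residual(t, beta, omega0):
    """Left side of (5.28) evaluated on the solution above; ideally zero."""
    x, v, a = underdamped_state(t, beta, omega0)
    return a + 2.0 * beta * v + omega0**2 * x
```

The residual should vanish to within rounding error at any time t, for any β below ω₀.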


Strong Damping 

Suppose instead that the damping constant β is large. Specifically, suppose that 

$$\beta > \omega_0, \tag{5.39}$$

a condition sometimes called overdamping. In this case, the square root in the 
exponents of (5.33) is real and our solution is 

$$x(t) = C_1 e^{-\left(\beta - \sqrt{\beta^2 - \omega_0^2}\right)t} + C_2 e^{-\left(\beta + \sqrt{\beta^2 - \omega_0^2}\right)t}. \tag{5.40}$$

Here we have two real exponential functions, both of which decrease as time goes by 
(since the coefficients of t in both exponents are negative). In this case, the motion 
is so damped that it completes no bona fide oscillations. Figure 5.12 shows a typical 
case in which the oscillator was given a kick from O at t = 0; it slid out to a maximum 
displacement and then slid ever more slowly back again, returning to the origin only 
in the limit that t → ∞. The first term on the right of (5.40) decreases more slowly 
than the second, since the coefficient in its exponent is the smaller of the two. Thus 
the long-term motion is dominated by this first term. In particular, the rate at which 
the motion dies out can be characterized by the coefficient in the first exponent, 

$$(\text{decay parameter}) = \beta - \sqrt{\beta^2 - \omega_0^2} \quad [\text{overdamped motion}]. \tag{5.41}$$

Careful inspection of (5.41) shows that, contrary to what one might expect, the 
rate of decay of overdamped motion gets smaller if the damping constant β is made 
bigger. (See Problem 5.20.) 
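
One way to see this, offered here as a supplementary sketch and not necessarily the route intended in Problem 5.20, is to rationalize the expression in (5.41):

```latex
\beta - \sqrt{\beta^2 - \omega_0^2}
  = \frac{\left(\beta - \sqrt{\beta^2 - \omega_0^2}\right)
          \left(\beta + \sqrt{\beta^2 - \omega_0^2}\right)}
         {\beta + \sqrt{\beta^2 - \omega_0^2}}
  = \frac{\omega_0^2}{\beta + \sqrt{\beta^2 - \omega_0^2}}
  \approx \frac{\omega_0^2}{2\beta}
  \qquad (\beta \gg \omega_0).
```

The denominator grows as β grows, so the decay parameter falls, and for very heavy damping it falls off roughly like 1/β.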


Figure 5.12 Overdamped motion in which the oscillator is kicked 
from the origin at t = 0. It moves out to a maximum displacement and 
then moves back toward O asymptotically as t → ∞. 


Critical Damping 

The boundary between underdamping and overdamping is called critical damping
and occurs when the damping constant is equal to the natural frequency, $\beta = \omega_0$. This
case has some interesting features, especially from a mathematical point of view.
When $\beta = \omega_0$ the two solutions that we found in (5.33) are the same solution, namely

$$x(t) = e^{-\beta t}. \tag{5.42}$$

[This happened because the two solutions of the auxiliary equation (5.30) happen
to coincide when $\beta = \omega_0$.] This is the one case where our inspired guess, to seek a
solution of the form $x(t) = e^{rt}$, fails to find us two solutions of the equation of motion,
and we have to find a second solution by some other method. Fortunately, in this case,
it is not hard to spot a second solution: As you can easily check, the function

$$x(t) = te^{-\beta t} \tag{5.43}$$

is also a solution of the equation of motion (5.28) in the special case that $\beta = \omega_0$. (See
Problems 5.21 and 5.24.) Thus the general solution for the case of critical damping is

$$x(t) = C_1 e^{-\beta t} + C_2 t e^{-\beta t}. \tag{5.44}$$
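
The check that $te^{-\beta t}$ is a solution takes only a few lines. The following is a sketch that assumes the undriven equation of motion (5.28) has the form $\ddot{x} + 2\beta\dot{x} + \omega_0^2 x = 0$, as the left side of the driven equation (5.48) suggests; the intermediate derivatives are worked out here and are not quoted from the text. Setting $\omega_0 = \beta$ and substituting:

```latex
\begin{aligned}
x &= t e^{-\beta t}, \qquad
\dot{x} = (1 - \beta t)\,e^{-\beta t}, \qquad
\ddot{x} = (\beta^2 t - 2\beta)\,e^{-\beta t}, \\
\ddot{x} + 2\beta\dot{x} + \beta^2 x
  &= \bigl[(\beta^2 t - 2\beta) + 2\beta(1 - \beta t) + \beta^2 t\bigr]\,e^{-\beta t} = 0 .
\end{aligned}
```

The terms in the bracket cancel identically, but only because the coefficient of $x$ is exactly $\beta^2$; for $\omega_0 \neq \beta$ a term $(\omega_0^2 - \beta^2)\,t e^{-\beta t}$ would survive.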

Notice that both terms contain the same exponential factor $e^{-\beta t}$. Since this factor is
what dominates the decay of the oscillations as $t \to \infty$, we can say that both terms
decay at about the same rate, with decay parameter

$$(\text{decay parameter}) = \beta = \omega_0 \qquad [\text{critical damping}].$$

It is interesting to compare the rates at which the various types of damped oscillation
die out. We have seen that in each case, this rate is determined by a “decay
parameter,” which is just the coefficient of $t$ in the exponent of the dominant exponential
factor in $x(t)$. Our findings can be summarized as follows:

damping     $\beta$                  decay parameter
none        $\beta = 0$              $0$
under       $\beta < \omega_0$       $\beta$
critical    $\beta = \omega_0$       $\beta$
over        $\beta > \omega_0$       $\beta - \sqrt{\beta^2 - \omega_0^2}$

Figure 5.13 is a plot of the decay parameter as a function of $\beta$ and shows clearly that
the motion dies out most quickly when $\beta = \omega_0$; that is, when the damping is critical.
There are situations where one wants any oscillations to die out as quickly as possible. 
For example, one wants the needle of an analog meter (a voltmeter or pressure gauge, 
for instance) to settle down rapidly on the correct reading. Similarly, in a car, one 
wants the oscillations caused by a bumpy road to decay quickly. In such cases one 
must arrange for the oscillations to be damped (by the shock absorbers in a car), and 
for the quickest results the damping should be reasonably close to critical. 
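
The summary of decay parameters can be turned into a small function. The sketch below is illustrative only: the function name `decay_parameter`, the choice $\omega_0 = 1$, and the sample values of $\beta$ are assumptions made for the example, not values from the text.

```python
import math

def decay_parameter(beta, omega0):
    """Decay parameter of the dominant exponential for damping constant beta.

    For beta <= omega0 (no, under-, or critical damping) the motion decays
    as exp(-beta*t). For beta > omega0 (overdamping) the slower of the two
    exponentials dominates, with coefficient beta - sqrt(beta^2 - omega0^2).
    """
    if beta <= omega0:
        return beta
    return beta - math.sqrt(beta**2 - omega0**2)

omega0 = 1.0                          # illustrative natural frequency
betas = [0.0, 0.5, 1.0, 2.0, 5.0]     # illustrative damping constants
params = [decay_parameter(b, omega0) for b in betas]

for b, p in zip(betas, params):
    print(f"beta = {b:4.1f}   decay parameter = {p:.4f}")
```

Among the sampled values, the largest decay parameter occurs at $\beta = \omega_0$, and it drops again once $\beta$ exceeds $\omega_0$.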
Figure 5.13 The decay parameter for damped oscillations as
a function of the damping constant $\beta$. The decay parameter
is biggest, and the motion dies out most quickly, for critical
damping, with $\beta = \omega_0$.


5.5 Driven Damped Oscillations 


Any natural oscillator, left to itself, eventually comes to rest, as the inevitable damping 
forces drain its energy. Thus if one wants the oscillations to continue, one must arrange 
for some external “driving” force to maintain them. For example, the motion of the 
pendulum in a grandfather clock is driven by periodic pushes caused by the clock’s 
weights; the motion of a young child on a swing is maintained by periodic pushes 
from a parent. If we denote the external driving force by $F(t)$ and if we assume as
before that the damping force has the form $-bv$, then the net force on the oscillator
is $-bv - kx + F(t)$ and the equation of motion can be written as

$$m\ddot{x} + b\dot{x} + kx = F(t). \tag{5.45}$$

Like its counterpart for undriven oscillations, this differential equation crops up in 
several other areas of physics. A prominent example is the LRC circuit of Figure 5.10. 
If we want the oscillating current in that circuit to persist, we must apply a driving 
EMF, $\mathcal{E}(t)$, in which case the equation of motion for the circuit becomes

$$L\ddot{q} + R\dot{q} + \frac{1}{C}\,q = \mathcal{E}(t) \tag{5.46}$$

in perfect correspondence with (5.45). 

As before, we can tidy Equation (5.45) if we divide the equation by $m$ and replace
$b/m$ by $2\beta$ and $k/m$ by $\omega_0^2$. In addition, I shall denote $F(t)/m$ by

$$f(t) = \frac{F(t)}{m}, \tag{5.47}$$

the force per unit mass. With this notation, (5.45) becomes

$$\ddot{x} + 2\beta\dot{x} + \omega_0^2 x = f(t). \tag{5.48}$$


Linear Differential Operators

Before we discuss how to solve this equation, I would like to streamline our notation. 
It turns out that it is very helpful to think of the left side of (5.48) as the result of a 
certain operator acting on the function x(t). Specifically, we define the differential 
operator 

$$D = \frac{d^2}{dt^2} + 2\beta\frac{d}{dt} + \omega_0^2. \tag{5.49}$$

The meaning of this definition is simply that when $D$ acts on $x$ it gives the left side
of (5.48):

$$Dx = \ddot{x} + 2\beta\dot{x} + \omega_0^2 x.$$

This definition is obviously a notational convenience, since the equation (5.48) becomes
just

$$Dx = f, \tag{5.50}$$

but it is much more: The notion of an operator like (5.49) proves to be a powerful
mathematical tool, with applications throughout physics. For the moment the important
thing is that the operator is linear. We know from elementary calculus that the
derivative of $ax$ (where $a$ is a constant) is just $a\dot{x}$ and that the derivative of $x_1 + x_2$ is
just $\dot{x}_1 + \dot{x}_2$. Since this also applies to second derivatives, it applies to the operator $D$:

$$D(ax) = aDx \quad\text{and}\quad D(x_1 + x_2) = Dx_1 + Dx_2.$$

(Make sure you understand what these two equations mean.) We can combine these 
into a single equation: 


$$D(ax_1 + bx_2) = aDx_1 + bDx_2 \tag{5.51}$$

for any two constants $a$ and $b$ and any two functions $x_1(t)$ and $x_2(t)$. Any operator
that satisfies this equation is called a linear operator.
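
The linearity property can also be checked numerically. The sketch below is only an illustration: it replaces the derivatives in $D$ by central finite differences, and the values of `BETA`, `OMEGA0`, the step `H`, the test functions, and the constants `a` and `b` are all arbitrary assumptions chosen for the example.

```python
import math

BETA, OMEGA0 = 0.3, 2.0   # illustrative damping constant and natural frequency
H = 1e-3                  # finite-difference step (an assumption of this sketch)

def D(x):
    """Return the function t -> x''(t) + 2*BETA*x'(t) + OMEGA0**2 * x(t),
    with both derivatives approximated by central differences."""
    def Dx(t):
        second = (x(t + H) - 2.0 * x(t) + x(t - H)) / H**2
        first = (x(t + H) - x(t - H)) / (2.0 * H)
        return second + 2.0 * BETA * first + OMEGA0**2 * x(t)
    return Dx

def x1(t):
    return math.sin(t)

def x2(t):
    return math.exp(-0.5 * t)

a, b = 3.0, -1.5

def combo(t):
    return a * x1(t) + b * x2(t)

t = 0.7
lhs = D(combo)(t)                    # D acting on a*x1 + b*x2
rhs = a * D(x1)(t) + b * D(x2)(t)    # a*D(x1) + b*D(x2)
print(lhs, rhs)
```

The two printed numbers agree up to rounding error, because the finite-difference version of $D$ is itself built only from sums and constant multiples of values of $x$.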

We have actually used the property (5.51) of linear operators before. The equation 
(5.28) for a damped oscillator (not driven) can be written as 


Dx = 0. (5.52) 

The superposition principle asserts that if $x_1$ and $x_2$ are solutions of this equation, then
so is $ax_1 + bx_2$ for any constants $a$ and $b$. In our new operator notation, the proof is
very simple: We are given that $Dx_1 = 0$ and $Dx_2 = 0$, and using (5.51) it immediately
follows that

$$D(ax_1 + bx_2) = aDx_1 + bDx_2 = 0 + 0 = 0;$$

that is, $ax_1 + bx_2$ is also a solution.

The equation (5.52), $Dx = 0$, for the undriven oscillator is called a homogeneous
equation, since every term involves either $x$ or one of its derivatives exactly once. The
equation (5.50), $Dx = f$, is called an inhomogeneous equation, since it contains the
inhomogeneous term $f$, which does not involve $x$ at all. Our job now is to solve this
inhomogeneous equation.


Particular and Homogeneous Solutions 

Using our new operator notation, we can find the general solution of Equation (5.48) 
surprisingly easily; in fact, we’ve already done most of the work. Suppose first that we 
have somehow spotted a solution; that is, we’ve found a function $x_p(t)$ that satisfies

$$Dx_p = f. \tag{5.53}$$

We call this function $x_p(t)$ a particular solution of the equation, and the subscript “p”
stands for “particular.” Next let us consider for a moment the homogeneous equation
$Dx = 0$ and suppose we have a solution $x_h(t)$, satisfying

$$Dx_h = 0. \tag{5.54}$$

We’ll call this function a homogeneous solution.⁸ We already know all about the
solutions of the homogeneous equation, and we know from (5.32) that $x_h$ must have
the form

$$x_h(t) = C_1 e^{r_1 t} + C_2 e^{r_2 t}, \tag{5.55}$$

where both exponentials die out as $t \to \infty$.

We’re now ready to prove the crucial result. First, if $x_p$ is a particular solution
satisfying (5.53), then $x_p + x_h$ is another solution, for

$$D(x_p + x_h) = Dx_p + Dx_h = f + 0 = f.$$

Given the one particular solution $x_p$, this gives us a large number of other solutions
$x_p + x_h$. And we have in fact found all the solutions, since the function $x_h$ contains
two arbitrary constants, and we know that the general solution of any second-order
equation contains exactly two arbitrary constants. Therefore $x_p + x_h$, with $x_h$ given
by (5.55), is the general solution.

This result means that all we have to do is somehow to find a single particular
solution $x_p(t)$ of the equation of motion (5.48), and we shall have every solution in
the form $x(t) = x_p(t) + x_h(t)$.


Complex Solutions for a Sinusoidal Driving Force 

I shall now specialize to the case that the driving force $f(t)$ is a sinusoidal function
of time,

$$f(t) = f_0\cos(\omega t), \tag{5.56}$$


8 Another common name is the complementary function. This has the disadvantage that it’s hard 
to remember which is “particular” and which “complementary.” 
where $f_0$ denotes the amplitude of the driving force [actually the amplitude divided
by the oscillator’s mass, since $f(t) = F(t)/m$] and $\omega$ is the angular frequency of
the driving force. (Be careful to distinguish the driving frequency $\omega$ from the natural
frequency $\omega_0$ of the oscillator. These are entirely independent frequencies, although
we shall see that the oscillator responds most when $\omega \approx \omega_0$.) The driving force for
many driven oscillators is at least approximately sinusoidal. For example, even the
parent pushing the child on a swing can be crudely approximated by (5.56); the driving
EMF induced in your radio’s circuits by a broadcast signal is almost perfectly of this
form. Probably the chief importance of sinusoidal driving forces is that, according
to Fourier’s theorem,⁹ essentially any driving force can be built up as a series of
sinusoidal forces.

Let us therefore assume the driving force is given by (5.56), so that the equation 
of motion (5.48) takes the form 

$$\ddot{x} + 2\beta\dot{x} + \omega_0^2 x = f_0\cos(\omega t). \tag{5.57}$$

Solving this equation is greatly simplified by the following trick: For any solution of
(5.57), there must be a solution of the same equation, but with the cosine on the right
replaced by a sine function. (After all, these two differ only by a shift in the origin of
time.) Accordingly, there must also be a function $y(t)$ that satisfies

$$\ddot{y} + 2\beta\dot{y} + \omega_0^2 y = f_0\sin(\omega t). \tag{5.58}$$

Suppose now we define the complex function

$$z(t) = x(t) + iy(t), \tag{5.59}$$

with $x(t)$ as its real part and $y(t)$ as its imaginary part. If we multiply (5.58) by $i$ and
add it to (5.57), we find that

$$\ddot{z} + 2\beta\dot{z} + \omega_0^2 z = f_0 e^{i\omega t}. \tag{5.60}$$

Although it may not yet look it, Equation (5.60) is a tremendous advance. Because 
of the simple properties of the exponential function, (5.60) is much easier to solve 
than either (5.57) or (5.58). And as soon as we find a solution z(t) of (5.60), we have 
only to take its real part to have a solution of the equation (5.57) whose solutions we 
actually want. 

In seeking a solution of (5.60), we are obviously free to try any function we like. 
In particular, let’s see if there is a solution of the form 


$$z(t) = Ce^{i\omega t}, \tag{5.61}$$

where $C$ is an as yet undetermined constant. If we substitute this guess into the left
side of (5.60), we get

$$(-\omega^2 + 2i\beta\omega + \omega_0^2)\,Ce^{i\omega t} = f_0 e^{i\omega t}.$$


9 Named for the French mathematician, the Baron Jean Baptiste Joseph Fourier, 1768-1830. See 
Sections 5.7-5.9. 
In other words, our guess (5.61) is a solution of (5.60) if and only if

$$C = \frac{f_0}{\omega_0^2 - \omega^2 + 2i\beta\omega}, \tag{5.62}$$


and we have succeeded in finding a particular solution of the equation of motion. 

Before we take the real part of $z(t) = Ce^{i\omega t}$, it is convenient to write the complex
coefficient $C$ in the form

$$C = Ae^{-i\delta}, \tag{5.63}$$

where $A$ and $\delta$ are real. [Any complex number can be written in this form; the particular
notation is chosen to match (5.14).] To identify $A$ and $\delta$, we must compare (5.62) and
(5.63). First, taking the absolute value squared of both equations we find that

$$A^2 = CC^* = \frac{f_0}{\omega_0^2 - \omega^2 + 2i\beta\omega}\cdot\frac{f_0}{\omega_0^2 - \omega^2 - 2i\beta\omega}$$

or

$$A^2 = \frac{f_0^2}{(\omega_0^2 - \omega^2)^2 + 4\beta^2\omega^2}. \tag{5.64}$$

(Make sure you understand this derivation. See Problem 5.35 for some guidance.) 
We are going to see in a moment that A is just the amplitude of the oscillations 
caused by the driving force f(t). Thus the result (5.64) is the most important result 
of this discussion. It shows how the amplitude of oscillations depends on the various 
parameters. In particular, we see that the amplitude is biggest when $\omega_0 \approx \omega$, so that
the denominator is small; in other words, the oscillator responds best when driven at
a frequency $\omega$ that is close to its natural frequency $\omega_0$, as you would probably have
guessed. 

Before we continue to discuss the properties of our solution, we need to identify
the phase angle $\delta$ in (5.63). Comparing (5.63) and (5.62) and rearranging, we see that

$$f_0 e^{i\delta} = A(\omega_0^2 - \omega^2 + 2i\beta\omega).$$

Since $f_0$ and $A$ are real, this says that the phase angle $\delta$ is the same as the phase
angle of the complex number $(\omega_0^2 - \omega^2) + 2i\beta\omega$. This relationship is illustrated in
Figure 5.14, from which we conclude that

$$\delta = \arctan\left(\frac{2\beta\omega}{\omega_0^2 - \omega^2}\right). \tag{5.65}$$


Our quest for a particular solution is now complete. The “fictitious” complex
solution introduced in (5.59) is

$$z(t) = Ce^{i\omega t} = Ae^{i(\omega t - \delta)}$$

and the real part of this is the solution we are seeking,

$$x(t) = A\cos(\omega t - \delta) \tag{5.66}$$

where the real constants $A$ and $\delta$ are given by (5.64) and (5.65).
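
The route from the complex coefficient $C$ to the real amplitude and phase can be followed with Python's built-in complex numbers. This is a minimal sketch: the function name `steady_state` and the parameter values are assumptions chosen for illustration, and the phase is taken with `atan2` so that it lies between 0 and $\pi$.

```python
import cmath
import math

def steady_state(f0, omega, omega0, beta):
    """Return (A, delta) such that x(t) = A*cos(omega*t - delta)."""
    C = f0 / (omega0**2 - omega**2 + 2j * beta * omega)   # Eq. (5.62)
    A = abs(C)                   # since C = A*exp(-i*delta), Eq. (5.63)
    delta = -cmath.phase(C)
    return A, delta

# Illustrative parameters (assumed, not from the text)
f0, omega, omega0, beta = 1.0, 1.5, 2.0, 0.2
A, delta = steady_state(f0, omega, omega0, beta)

# The same quantities from the closed forms (5.64) and (5.65)
A_formula = f0 / math.sqrt((omega0**2 - omega**2)**2 + 4 * beta**2 * omega**2)
delta_formula = math.atan2(2 * beta * omega, omega0**2 - omega**2)

def residual(t):
    """Left side minus right side of (5.57) for x(t) = A*cos(omega*t - delta),
    using the exact derivatives of the cosine."""
    phase = omega * t - delta
    return ((omega0**2 - omega**2) * A * math.cos(phase)
            - 2 * beta * omega * A * math.sin(phase)
            - f0 * math.cos(omega * t))

print(A, A_formula)
print(delta, delta_formula)
print(max(abs(residual(0.1 * k)) for k in range(100)))
```

The residual is zero to rounding error, which confirms that the real part of $z(t)$ satisfies the equation with the cosine driving force.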
Figure 5.14 The phase angle $\delta$ is the angle of this triangle. [Right triangle with sides $\omega_0^2 - \omega^2$ and $2\beta\omega$.]


The solution (5.66) is just one particular solution of the equation of motion. The 
general solution is found by adding any solution of the corresponding homogeneous 
equation, as given by (5.55); that is, the general solution is 

$$x(t) = A\cos(\omega t - \delta) + C_1 e^{r_1 t} + C_2 e^{r_2 t}. \tag{5.67}$$

Because the two extra terms in this general solution both die out exponentially as time 
passes, they are called transients. They depend on the initial conditions of the problem 
but are eventually irrelevant: The long-term behavior of our solution is dominated by 
the cosine term. Thus the particular solution (5.66) is the solution with which we are 
usually concerned, and we shall explore its properties in the next section. 

Before we discuss an example of the motion (5.67), it is important that you be 
very clear as to the type of system to which (5.67) applies, namely, any oscillator 
for which both the restoring force ($-kx$) and the resistive force ($-b\dot{x}$) are linear, that is,
a driven, damped linear oscillator, whose equation of motion (5.45) is a linear
differential equation. Because nonlinear differential equations are often hard to solve, 
most mechanics texts have until recently focussed on linear equations. This created 
the false impression that linear equations were in some sense the norm, and that the 
solution (5.67) was the only (or, at least, the only important) way for an oscillator to 
behave. As we shall see in Chapter 12 on nonlinear mechanics and chaos, an oscillator 
whose equation of motion is nonlinear can behave in ways that are astonishingly 
different from (5.67). One important reason for studying the linear oscillator here 
is to give you a backdrop against which to study the nonlinear oscillator later on. 

The details of the motion (5.67) depend on the strength of the damping parameter
$\beta$. To be specific, let us assume that our oscillator is weakly damped, with $\beta$ less
than the natural frequency $\omega_0$ (underdamping). In this case, we know that the two
transient terms of (5.67) can be rewritten as in (5.38), so that

$$x(t) = A\cos(\omega t - \delta) + A_{\text{tr}}\,e^{-\beta t}\cos(\omega_1 t - \delta_{\text{tr}}). \tag{5.68}$$


You need to think very carefully about this potentially confusing formula. The second
term on the right is the homogeneous or transient term, and I have added the subscript
“tr” to distinguish the constants $A_{\text{tr}}$ and $\delta_{\text{tr}}$ from the $A$ and $\delta$ of the first term. The
two constants $A_{\text{tr}}$ and $\delta_{\text{tr}}$ are arbitrary constants; (5.68) is a possible motion of our
system for any values of $A_{\text{tr}}$ and $\delta_{\text{tr}}$, which are determined by the initial conditions.
The factor $e^{-\beta t}$ makes clear that this transient term decays exponentially and is indeed
irrelevant to the long-term behavior. As it decays, the transient term oscillates with
the angular frequency $\omega_1$ of the undriven (but still damped) oscillator, as in (5.36).
The first term is our particular solution and its two constants $A$ and $\delta$ are certainly
not arbitrary; they are determined by (5.64) and (5.65) in terms of the parameters of
the system. This term oscillates with the frequency $\omega$ of the driving force and with
unchanging amplitude, for as long as the driving force is maintained.

EXAMPLE 5.3 Graphing a Driven Damped Linear Oscillator

Make a plot of $x(t)$ as given by (5.68) for a driven damped linear oscillator which
is released from rest at the origin at time $t = 0$, with the following parameters:
Drive frequency $\omega = 2\pi$, natural frequency $\omega_0 = 5\omega$, decay constant $\beta = \omega_0/20$,
and driving amplitude $f_0 = 1000$. Show the first five drive cycles.

The choice of drive frequency equal to $2\pi$ means that the drive period is
$\tau = 2\pi/\omega = 1$; this means simply that we have chosen to measure time in units
of the drive period, which is often a convenient choice. That $\omega_0 = 5\omega$ means that our
oscillator has a natural frequency five times the drive frequency; this will let
us distinguish easily between the two on a graph. That $\beta = \omega_0/20$ means that
the oscillator is rather weakly damped. Finally the choice of $f_0 = 1000$ is just
a choice of our unit of force; the reason for this odd-seeming choice is that it
leads to a conveniently sized amplitude of oscillation (namely $A$ close to 1).

Our first task is to determine the various constants in (5.68) in terms of the
given parameters. In fact this is easier if we rewrite the transient term of (5.68)
in the “cosine plus sine” form, so that

$$x(t) = A\cos(\omega t - \delta) + e^{-\beta t}\left[B_1\cos(\omega_1 t) + B_2\sin(\omega_1 t)\right]. \tag{5.69}$$

The constants $A$ and $\delta$ are determined by (5.64) and (5.65), which, for the given
parameters, yield

$$A = 1.06 \quad\text{and}\quad \delta = 0.0208.$$

The frequency $\omega_1$ is

$$\omega_1 = \sqrt{\omega_0^2 - \beta^2} = 9.987\pi,$$

which is very close to $\omega_0$, as we would expect for a weakly damped oscillator.
To find $B_1$ and $B_2$, we must equate $x(0)$ as given by (5.69) to its given initial
value $x_0$, and likewise the corresponding expression for $\dot{x}(0)$ to the initial value $v_0$.
This gives two simultaneous equations for $B_1$ and $B_2$, which are easily solved
(Problem 5.33) to give

$$B_1 = x_0 - A\cos\delta \quad\text{and}\quad B_2 = \frac{1}{\omega_1}\left(v_0 - \omega A\sin\delta + \beta B_1\right) \tag{5.70}$$

or, with the numbers, including the initial conditions $x_0 = v_0 = 0$,

$$B_1 = -1.05 \quad\text{and}\quad B_2 = -0.0572.$$



Figure 5.15 The response of a damped, linear oscillator to a sinusoidal
driving force, with the time $t$ shown in units of the drive
period. (a) The driving force is a pure cosine as a function of time.
(b) The resulting motion for the initial conditions $x_0 = v_0 = 0$. For
the first two or three drive cycles, the transient motion is clearly visible,
but after that only the long-term motion remains, oscillating
sinusoidally at exactly the drive frequency. As explained in the text,
the sinusoidal motion after $t \approx 3$ is called an attractor.


Putting all of these numbers into (5.69), we can now plot the motion, as
shown in Figure 5.15, where part (a) shows the driving force $f(t) = f_0\cos(\omega t)$
and part (b) the resulting motion $x(t)$ of the oscillator. The driving force is, of
course, perfectly sinusoidal with period 1. The resulting motion is much more
interesting. After about three drive cycles ($t \approx 3$), the motion is indistinguishable
from a pure cosine, oscillating at exactly the drive frequency; that is, the
transients have died out and only the long-term motion remains. Before $t \approx 3$,
however, the effects of the transients are clearly visible. Since they oscillate at
the faster natural frequency $\omega_0$, they show up as a rapid succession of bumps
and dips. In fact, you can easily see that there are five such bumps within the
first drive cycle, indicating that $\omega_0 = 5\omega$.
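
The constants of Example 5.3 can be reproduced with a few lines of Python. The sketch below uses only the parameters stated in the example together with (5.64), (5.65), (5.69), and (5.70); the variable names and the sampling of 200 points per drive cycle are choices made here for illustration. The list `samples` could be handed to any plotting tool.

```python
import math

# Parameters stated in Example 5.3
omega = 2 * math.pi          # drive frequency
omega0 = 5 * omega           # natural frequency
beta = omega0 / 20           # decay constant
f0 = 1000.0                  # driving amplitude (force per unit mass)
x0, v0 = 0.0, 0.0            # released from rest at the origin

A = f0 / math.sqrt((omega0**2 - omega**2)**2 + 4 * beta**2 * omega**2)  # (5.64)
delta = math.atan2(2 * beta * omega, omega0**2 - omega**2)              # (5.65)
omega1 = math.sqrt(omega0**2 - beta**2)
B1 = x0 - A * math.cos(delta)                                           # (5.70)
B2 = (v0 - omega * A * math.sin(delta) + beta * B1) / omega1            # (5.70)

def x(t):
    """Position of the oscillator from Eq. (5.69)."""
    transient = math.exp(-beta * t) * (
        B1 * math.cos(omega1 * t) + B2 * math.sin(omega1 * t))
    return A * math.cos(omega * t - delta) + transient

# Sample the first five drive cycles (the drive period is 1)
samples = [(k / 200, x(k / 200)) for k in range(1001)]

print(f"A = {A:.3f}, delta = {delta:.4f}, omega1/pi = {omega1 / math.pi:.3f}")
print(f"B1 = {B1:.3f}, B2 = {B2:.4f}")
```

By $t = 5$ the factor $e^{-\beta t}$ has fallen below $10^{-3}$, so the last samples follow the pure cosine $A\cos(\omega t - \delta)$ very closely.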


Because the transient motion depends on the initial values x 0 and v 0 , different 
values of x 0 and v 0 would lead to quite different initial motion. (See Problem 5.36.) 
After a short time however (a couple of cycles in this example), the initial differences 
disappear and the motion settles down to the same sinusoidal motion of the particular 
solution (5.66), irrespective of the initial conditions. For this reason, the motion 
of (5.66) is sometimes called an attractor — the motions corresponding to several 
different initial conditions are “attracted” to the particular motion (5.66). For the linear 
oscillator discussed here, there is a unique attractor (for a given driving force): Every 
possible motion of the system, whatever its initial conditions, is attracted to the same 
motion (5.66). We shall see in Chapter 12 that for nonlinear oscillators there can be 
several different attractors and that for some values of the parameters the motion of
an attractor can be far more complicated than simple harmonic oscillation at the drive
frequency.

The amplitude and phase of the attractor seen in Figure 5.15(b) depend on the 
parameters of the driving force (but not, of course, on the initial conditions). The 
dependence of the amplitude and phase on these parameters is the subject of the next 
section. 


5.6 Resonance 


In the previous section we considered a damped oscillator that is being driven by
a sinusoidal driving force (actually force divided by mass) $f(t) = f_0\cos(\omega t)$ with
angular frequency $\omega$. We saw that, apart from transient motions that die out quickly,
the system’s response is to oscillate sinusoidally at the same frequency, $\omega$:

$$x(t) = A\cos(\omega t - \delta),$$

with amplitude $A$ given by (5.64),

$$A^2 = \frac{f_0^2}{(\omega_0^2 - \omega^2)^2 + 4\beta^2\omega^2}, \tag{5.71}$$

and phase shift $\delta$ given by (5.65).

The most obvious feature of (5.71) is that the amplitude $A$ of the response is
proportional to the amplitude of the driving force, $A \propto f_0$, a result you might have
guessed. More interesting is the dependence of $A$ on the frequencies $\omega_0$ (the natural
frequency of the oscillator) and $\omega$ (the frequency of the driver), and on the damping
constant $\beta$. The most interesting case is when the damping constant $\beta$ is very small,
and this is the case I shall discuss. With $\beta$ small, the second term in the denominator of
(5.71) is small. If $\omega_0$ and $\omega$ are very different, then the first term in the denominator of
(5.71) is large, and the amplitude of the driven oscillations is small. On the other hand,
if $\omega_0$ is very close to $\omega$, both terms in the denominator are small, and the amplitude
$A$ is large. This means that if we vary either $\omega_0$ or $\omega$, there can be quite dramatic
changes in the amplitude of the oscillator’s motion. This is illustrated in Figure 5.16,
which shows $A^2$ as a function of $\omega_0$ with $\omega$ fixed, for a rather weakly damped system
($\beta = 0.1\omega$). (Note that, because the energy of the system is proportional to $A^2$, it is
usual to make plots of $A^2$ rather than $A$.)

Although the behavior illustrated in Figure 5.16 is startlingly dramatic, the qualitative
features are what you might have expected. Left to its own devices, the oscillator
vibrates at its natural frequency $\omega_0$ (or at the slightly lower frequency $\omega_1$ if we allow
for the damping). If we try to force it to vibrate at a frequency $\omega$, then for values of
$\omega$ close to $\omega_0$ the oscillator responds very well, but if $\omega$ is far from $\omega_0$, it hardly responds
at all. We refer to this phenomenon, the dramatically greater response of an
oscillator when driven at the right frequency, as resonance.

Figure 5.16 The amplitude squared, $A^2$, of a driven oscillator,
shown as a function of the natural frequency $\omega_0$, with the driving
frequency $\omega$ fixed. The response is dramatically largest when $\omega_0$
and $\omega$ are close.

An everyday application of resonance is the reception of radio signals by the LRC
circuit in your radio. As we saw, the equation of motion of an LRC circuit is exactly
the same as that of a driven oscillator, and LRC circuits show the same phenomenon
of resonance. When you tune your radio to receive station KVOD at 90.1 MHz, you 
are adjusting an LRC circuit in the radio so that its natural frequency is 90.1 MHz. 
The many radio stations in your neighborhood are all sending out signals, each at its 
own frequency and each inducing a tiny EMF in the circuit of your radio. But only 
the signal with the right frequency actually succeeds in driving an appreciable current, 
which mimics the signal sent out by your favorite KVOD and reproduces its broadcast 
sounds. 

An example of a mechanical resonance of the kind discussed here is the behavior 
of a car driving on a “washboard” road that has worn into a series of regularly spaced 
bumps. Each time a wheel crosses a bump, it is given an upward impulse, and the 
frequency of these impulses depends on the car’s speed. There is one speed at which 
the frequency of these impulses equals the natural frequency of the wheels’ vibration 
on the springs¹⁰ and the wheels resonate, causing an uncomfortable ride. If the car
drives slower or faster than this speed, it goes “off resonance” and the ride is much 
smoother. 

Another example occurs when a platoon of soldiers marches across a bridge. A 
bridge, like almost any mechanical system, has certain natural frequencies, and if the 
soldiers happened to march with a frequency equal to one of these natural frequencies, 
the bridge could conceivably resonate sufficiently violently to break. For this reason, 
soldiers break step when marching across a bridge. 

The details of the resonance phenomenon are a bit complicated. For example, the
exact location of the maximum response depends on whether we vary $\omega_0$ with $\omega$ fixed
or vice versa. The amplitude $A$ is a maximum when the denominator,

$$\text{denominator} = (\omega_0^2 - \omega^2)^2 + 4\beta^2\omega^2, \tag{5.72}$$


10 It is the wheels (plus axles) that exhibit the resonant oscillations; the much heavier body of 
the car is relatively unaffected. 
of (5.71) is a minimum. If we vary $\omega_0$ with $\omega$ fixed (as when you tune a radio to pick
up your favorite station) this minimum obviously occurs when $\omega_0 = \omega$, making the
first term zero. On the other hand, if we vary $\omega$ with $\omega_0$ fixed (which is what happens
in many applications), then the second term in (5.72) also varies and a straightforward
differentiation shows that the maximum occurs when

$$\omega = \omega_2 = \sqrt{\omega_0^2 - 2\beta^2}. \tag{5.73}$$

However, when $\beta \ll \omega_0$ (as is usually the most interesting case), the difference
between (5.73) and $\omega = \omega_0$ is negligible.
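
The differentiation mentioned above can be written out as follows. This is a sketch of one way to do it, holding $\omega_0$ and $\beta$ fixed and minimizing the denominator (5.72) with respect to $\omega$:

```latex
\begin{aligned}
\frac{d}{d\omega}\Bigl[(\omega_0^2 - \omega^2)^2 + 4\beta^2\omega^2\Bigr]
  &= -4\omega(\omega_0^2 - \omega^2) + 8\beta^2\omega
   = 4\omega\bigl(\omega^2 - \omega_0^2 + 2\beta^2\bigr) = 0 \\
\Longrightarrow\quad \omega^2 &= \omega_0^2 - 2\beta^2 .
\end{aligned}
```

The other stationary point, $\omega = 0$, is not the resonance peak when $\omega_0^2 > 2\beta^2$. The nonzero root is the value $\omega_2 = \sqrt{\omega_0^2 - 2\beta^2}$ listed in the summary of frequencies, and for $\beta \ll \omega_0$ it differs from $\omega_0$ only by a term of order $\beta^2/\omega_0$.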

We have met so many different frequencies in this chapter that it may be worth
pausing to review them. First there is $\omega_0$, the natural frequency of the oscillator (in
the absence of any damping). Next, when we added in a little damping, we found
that the same system oscillated sinusoidally with frequency $\omega_1 = \sqrt{\omega_0^2 - \beta^2}$ under an
exponentially decaying envelope. Then we added a driving force with frequency $\omega$,
which can, in principle, take on any value independently of the previous two. However,
the response of the driven oscillator is biggest when $\omega \approx \omega_0$; specifically, if we vary
$\omega$ with $\omega_0$ fixed, the maximum response comes when $\omega = \omega_2$ as defined by (5.73). To
summarize:

$\omega_0 = \sqrt{k/m}$ = natural frequency of undamped oscillator
$\omega_1 = \sqrt{\omega_0^2 - \beta^2}$ = frequency of damped oscillator
$\omega$ = frequency of driving force
$\omega_2 = \sqrt{\omega_0^2 - 2\beta^2}$ = value of $\omega$ at which response is maximum.

In any case, the maximum amplitude of the driven oscillations is found by putting
$\omega_0 \approx \omega$ in (5.71), to give

$$A_{\max} \approx \frac{f_0}{2\beta\omega_0}. \tag{5.74}$$

This shows that smaller values of the damping constant $\beta$ lead to larger values of the
maximum amplitude of oscillation, as illustrated in Figure 5.17.¹¹


Width of the Resonance; the Q Factor 

You can see clearly from Figure 5.17 that if we make the damping constant $\beta$ smaller,
the resonance peak not only gets higher, but also gets narrower. We can make this idea
more precise by defining the width (or full width at half maximum or FWHM) as
the interval between the two points where $A^2$ is equal to half its maximum height. It
is a simple exercise (Problem 5.41) to show that the two half-maximum points are at
$\omega \approx \omega_0 \pm \beta$, as in Figure 5.18. Thus the full width at half maximum is

$$\text{FWHM} \approx 2\beta \tag{5.75}$$

or, equivalently, the half width at half maximum is

$$\text{HWHM} \approx \beta. \tag{5.76}$$

11 In this figure I chose to plot $A^2$ against $\omega$ with $\omega_0$ fixed, rather than the other way around as
in Figure 5.16. Note that the curves have very similar shapes either way.

Figure 5.17 The amplitude for driven oscillations as a function
of the driving frequency $\omega$ for three different values of the damping
constant $\beta$. Note how as $\beta$ decreases the resonance peak gets
higher and sharper.


The sharpness of the resonance peak is indicated by the ratio of its width $2\beta$ to its
position, $\omega_0$. For many purposes, we want a very sharp resonance, so it is common
practice to define a quality factor $Q$ as the reciprocal of this ratio:

$$Q = \frac{\omega_0}{2\beta}. \tag{5.77}$$

Figure 5.18 The full width at half maximum (FWHM) is the
distance between the points where $A^2$ is half its maximum value.
A large $Q$ indicates a narrow resonance, and vice versa. For example, clocks depend
on the resonance in an oscillator (a pendulum, or a quartz crystal, for instance) to
regulate the mechanism to move with very well-defined frequency. This requires that
the width $2\beta$ be very small compared to the natural frequency $\omega_0$. In other words, a
good clock needs a high $Q$. The $Q$ for a typical pendulum may be around 100; that
for a quartz crystal around 10,000. Therefore quartz watches keep much better time
than a typical grandfather clock.¹²
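
The approximation FWHM $\approx 2\beta$ can be tested numerically. The sketch below locates the two half-maximum points of $A^2$ by bisection for one weakly damped case; the values $\omega_0 = 1$ and $\beta = 0.05$, the search brackets, and the helper names are assumptions made for this illustration.

```python
import math

def amp_squared(omega, omega0, beta, f0=1.0):
    """A^2 as a function of the driving frequency, Eq. (5.71)."""
    return f0**2 / ((omega0**2 - omega**2)**2 + 4 * beta**2 * omega**2)

def bisect(g, lo, hi, tol=1e-12):
    """Find a root of g between lo and hi, assuming g changes sign there."""
    g_lo_positive = g(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == g_lo_positive:
            lo = mid      # root lies in the upper half
        else:
            hi = mid      # root lies in the lower half
    return 0.5 * (lo + hi)

omega0, beta = 1.0, 0.05                             # illustrative values
omega_peak = math.sqrt(omega0**2 - 2 * beta**2)      # peak position, Eq. (5.73)
half_max = 0.5 * amp_squared(omega_peak, omega0, beta)

def g(w):
    """Zero where A^2 equals half its maximum value."""
    return amp_squared(w, omega0, beta) - half_max

w_low = bisect(g, 0.5 * omega0, omega_peak)
w_high = bisect(g, omega_peak, 1.5 * omega0)

fwhm = w_high - w_low
Q = omega0 / (2 * beta)                              # Eq. (5.77)
print(f"FWHM = {fwhm:.5f}, 2*beta = {2 * beta:.5f}, Q = {Q:.1f}")
```

For this choice the numerical width agrees with $2\beta$ to better than one percent; the agreement worsens as $\beta$ is made comparable to $\omega_0$, where the approximation is not expected to hold.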

There is another way to look at the quality factor $Q$. We saw that, in the absence
of any driving force, the oscillations die out in a time of order $1/\beta$,

$$(\text{decay time}) = 1/\beta.$$

(This was actually the time for the amplitude to drop to $1/e$ of its initial value.) The
period of a single oscillation is, of course,

$$\text{period} = 2\pi/\omega_0.$$

(Remember we’re assuming that $\beta \ll \omega_0$, so we don’t need to distinguish between $\omega_0$
and $\omega_1$.) Thus we can rewrite the definition of $Q$ as

$$Q = \frac{\omega_0}{2\beta} = \pi\,\frac{1/\beta}{2\pi/\omega_0} = \pi\,\frac{\text{decay time}}{\text{period}}. \tag{5.78}$$

The ratio on the right is just the number of periods in the decay time. Thus, the quality
factor $Q$ is $\pi$ times the number of cycles our oscillator makes in one decay time.¹³


The Phase at Resonance 


The phase difference $\delta$ by which the oscillator’s motion lags behind the driving force
is given by (5.65) as

$$\delta = \arctan\left(\frac{2\beta\omega}{\omega_0^2 - \omega^2}\right). \tag{5.79}$$


Let us follow this phase as we vary co, starting well below a narrow resonance (fi 
small). With co <$C co 0 , (5.79) implies that 8 is very small; that is, while co <SC co 0 the 
oscillations are almost perfectly in step with the driving force. (This was the case 
in Figure 5.15.) As co is increased toward co 0 , so 8 slowly increases. At resonance, 
where ft> = co Q , the argument of the arctangent in (5.79) is infinite, so 8 = n/2 and 
the oscillations are 90° behind the driving force. Once co > co 0 , the argument of the 


12 Actually, both quartz watches and grandfather clocks keep much better time than this simple 
discussion would suggest. A good chronometer keeps the frequency very close to the center of 
the resonance. Thus the variability of the frequency is actually much smaller than the width of the 
resonance. Nevertheless, the stated conclusion is correct. 

13 Yet another definition (and perhaps the most fundamental) is that Q = 2n times the ratio of 
the energy stored in the oscillator to the energy dissipated in one cycle. See Problem 5.44. 
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Figure 5.19 The phase shift 8 increases from 0 through tz/2 
to n as the driving frequency co passes through resonance. The 
narrower the resonance, the more suddenly this increase occurs. 
The solid curve is for a relatively narrow resonance (fi = 0.03<u o 
or Q — 16.7), and the dashed curve is for a wider resonance 
(fi = 0.3m o or Q = 1.67). 


arctangent is negative and approaches 0 as co increases; thus <5 increases beyond njl 
and eventually approaches n. In particular, once co » co 0 , the oscillations are almost 
perfectly out of step with the driving force. All of this behavior is illustrated for 
two different values of ft in Figure 5.19. Notice, in particular, that the narrower the 
resonance, the more quickly 8 jumps from 0 to n. 
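The way the phase shift of (5.79) sweeps from 0 through π/2 to π can be tabulated with a few lines of code. The following is a minimal Python sketch, not part of the original text: the function name phase_lag and the sample frequencies are illustrative assumptions, and ω_0 is set to 1 for convenience. It uses the two-argument arctangent so that δ stays on the branch running from 0 to π, which a bare arctangent of the ratio would not do once ω > ω_0.

```python
import math

def phase_lag(omega, omega0, beta):
    """Phase lag delta of (5.79), kept on the branch 0 <= delta <= pi."""
    # atan2(y, x) with y > 0 returns an angle between 0 and pi, so the result
    # passes smoothly through pi/2 at omega = omega0 instead of jumping sign.
    return math.atan2(2.0 * beta * omega, omega0**2 - omega**2)

omega0 = 1.0                      # assumed unit of frequency
for beta in (0.03, 0.3):          # the two widths quoted for Figure 5.19
    for omega in (0.5, 0.9, 1.0, 1.1, 2.0):
        delta = phase_lag(omega, omega0, beta)
        print(f"beta={beta:4.2f}  omega={omega:3.1f}  delta/pi={delta / math.pi:.3f}")
```

For the smaller β the printed values stay close to 0 or to 1 except very near ω = ω_0, which is the sudden jump described for the narrow resonance.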

In the resonances of classical mechanics, the behavior of the phase (as in Figure 5.19)
is usually less important than that of the amplitude (as in Figure 5.18). 14 In
atomic and nuclear collisions, the phase shift is often the quantity of primary interest.
Such collisions are governed by quantum mechanics, but there is a corresponding
phenomenon of resonance. A beam of neutrons, for example, can “drive” a target
nucleus. When the energy of the beam equals a resonant energy of the system (in
quantum mechanics energy plays the role of frequency) a resonance occurs and the
phase shift increases rapidly from 0 to π.


5.7 Fourier Series* 


* Fourier series have broad application in almost every area of modern physics. Nevertheless,
we shall not be using them again until Chapter 16. Thus, if you are pressed for time you could
omit the last three sections of this chapter on a first reading.

In the last two sections, we have discussed an oscillator that is driven by a sinusoidal
driving force f(t) = f_0 cos(ωt). There are two main reasons for the importance of
sinusoidal driving forces: The first is simply that there are many important systems
in which the driving force is sinusoidal; the electrical circuit in a radio is a good
example. The second is somewhat subtler. It turns out that any periodic driving force
can be built up from sinusoidal forces using the powerful technique of Fourier series.
Thus, in a sense that I shall try to describe, by solving the motion with a sinusoidal
driver we have already solved the motion with any periodic driver. Before we can
appreciate this wonderful result, we need to review some aspects of Fourier series. In
this section I sketch the needed properties of Fourier series; 15 in the next we can apply
them to the driven oscillator.

14 The behavior of δ can, nevertheless, be observed. Make a simple pendulum from a piece of
string and a metal nut, and drive it by holding it at the top and moving your hand from side to side.
The most obvious thing is that you will be most successful at driving it when your frequency equals
the natural frequency, but you can also see that when you drive more slowly the pendulum moves
in step with your hand, whereas when you move more quickly the pendulum moves oppositely to
your hand.

Figure 5.20 Two examples of periodic functions with period τ. (a) A rectangular
pulse, which could represent a hammer hitting a nail with a constant force at
intervals of τ, or a digital signal in a telephone line. (b) A smooth periodic signal,
which could be the pressure variation of a musical instrument.

Let us consider a function f(t) that is periodic with period τ; that is, the function
repeats itself every time t advances by the period τ:

    f(t + \tau) = f(t)

whatever the value of t. We can describe a function with this property as being
τ-periodic. A simple example of a τ-periodic function would be the force exerted on a
nail by a hammer that is being swung at intervals of τ, as sketched in Figure 5.20(a).
Another could be the pressure exerted on your ear drum by a note played by a musical
instrument, as sketched in Figure 5.20(b). It is easy to think up many more periodic functions.
In particular, there are lots of sinusoidal functions that are periodic with any given
period: The functions

    \cos(2\pi t/\tau), \quad \cos(4\pi t/\tau), \quad \cos(6\pi t/\tau), \quad \cdots    (5.80)

are all τ-periodic, as are the corresponding sine functions. (If t is increased by the
amount τ, each of these functions returns to its original value; see Figure 5.21.)
We can write these sinusoidal functions a little more compactly if we introduce the
angular frequency ω = 2π/τ, in which case all of the functions of (5.80) and the
corresponding sines can be written as

    \cos(n\omega t) \quad \text{and} \quad \sin(n\omega t) \qquad [n = 0, 1, 2, \cdots].    (5.81)

(If n = 0 the cosine function is just the constant 1, which is certainly periodic,
while the sine is 0 and not at all interesting.)

15 As usual, I shall try to describe all of the theory that we shall be needing. For more details, see,
for example, Mathematical Methods in the Physical Sciences by Mary Boas (Wiley, 1983), Ch. 7.

Figure 5.21 Any function of the form cos(2nπt/τ) (or the corresponding sine)
is periodic with period τ if n is an integer. Notice that cos(4πt/τ) also has the
smaller period τ/2, but this doesn’t change the fact that it has period τ as well.

That the sine and cosine functions (5.81) are all τ-periodic is reasonably obvious.
(Be sure you can see this.) What is truly amazing is that, in a sense, these sine and
cosine functions define all possible τ-periodic functions: In 1807 the French mathematician
Jean Baptiste Fourier (1768-1830) realized that every τ-periodic function
can be written as a linear combination of the sines and cosines of (5.81). That is, if
f(t) is any 16 periodic function with period τ then it can be expressed as the sum

    f(t) = \sum_{n=0}^{\infty} \left[ a_n\cos(n\omega t) + b_n\sin(n\omega t) \right]    (5.82)

where the constants a_n and b_n depend on the function f(t). This extraordinarily useful
result is called Fourier’s theorem, and the sum (5.82) is called the Fourier series for
f(t).

It is not hard to see why Fourier’s theorem met with considerable surprise, and
even skepticism, when he first published it. It claims that a discontinuous function,
such as the rectangular pulse of Figure 5.20(a), can be built up with sine and cosine
functions that are continuous and perfectly smooth. Surprising or not, this turns out
to be true, as we shall see by example shortly. Perhaps even more surprising, it is
often the case that one gets an excellent approximation by retaining just the first few
terms of a Fourier series. Thus, instead of having to handle a fairly disagreeable and
possibly discontinuous function, we have only to handle a reasonably small number
of sines and cosines. Before we discuss the application of Fourier’s theorem to the
driven oscillator, we need to look at a few properties of Fourier series.

16 As always with theorems of this kind, there are certain restrictions on the “reasonableness”
of the function f(t), but certainly Fourier’s theorem is valid for all of the functions we shall have
occasion to use.

The proof of Fourier’s theorem is difficult (indeed it was many years after
Fourier’s discovery before a satisfactory proof was found) and I shall simply ask
you to accept it. However, once the result is accepted it is easy to learn to use it. In
particular, for any given periodic function f(t) it is easy to find the coefficients a_n
and b_n. Problem 5.48 gives you the opportunity to show that these coefficients are
given by

    a_n = \frac{2}{\tau}\int_{-\tau/2}^{\tau/2} f(t)\cos(n\omega t)\,dt \qquad [n \geq 1]    (5.83)

and

    b_n = \frac{2}{\tau}\int_{-\tau/2}^{\tau/2} f(t)\sin(n\omega t)\,dt \qquad [n \geq 1].    (5.84)

Unfortunately the coefficients for n = 0 require separate attention. Since the term
sin(nωt) in (5.82) is identically zero for n = 0, the coefficient b_0 is irrelevant and we
can simply define it to be zero. It is very easy to show (Problem 5.46) that

    a_0 = \frac{1}{\tau}\int_{-\tau/2}^{\tau/2} f(t)\,dt.    (5.85)


Armed with these formulas for the Fourier coefficients, it is easy to find the Fourier 
series for any given periodic function. In the following example, we do this for the 
rectangular pulses of Figure 5.20(a). 


EXAMPLE 5.4   Fourier Series for the Rectangular Pulse

Find the Fourier series for the periodic rectangular pulse f(t) shown in Figure
5.22 in terms of the period τ, the pulse height f_max, and the duration of the pulse
Δτ. Using the values τ = 1, f_max = 1, and Δτ = 0.25, plot f(t), as well as the
sum of the first three terms of its Fourier series, and the sum of the first eleven
terms.

Figure 5.22 A periodic rectangular pulse. The period is τ, the
duration of the pulse is Δτ, and the pulse height is f_max.

Our first task is to calculate the Fourier coefficients a_n and b_n for the given
function. First, according to (5.85), the constant term a_0 is

    a_0 = \frac{1}{\tau}\int_{-\tau/2}^{\tau/2} f(t)\,dt
        = \frac{f_{\max}}{\tau}\int_{-\Delta\tau/2}^{\Delta\tau/2} dt
        = \frac{f_{\max}\,\Delta\tau}{\tau},    (5.86)

where the change in the limits of integration was allowed because the integrand
f(t) is zero outside ±Δτ/2. Next, according to (5.83), all the other a coefficients
(n ≥ 1) are given by

    a_n = \frac{2}{\tau}\int_{-\tau/2}^{\tau/2} f(t)\cos(n\omega t)\,dt
        = \frac{2 f_{\max}}{\tau}\int_{-\Delta\tau/2}^{\Delta\tau/2} \cos\left(\frac{2\pi n t}{\tau}\right) dt
        = \frac{4 f_{\max}}{\tau}\int_{0}^{\Delta\tau/2} \cos\left(\frac{2\pi n t}{\tau}\right) dt
        = \frac{2 f_{\max}}{\pi n}\sin\left(\frac{\pi n\,\Delta\tau}{\tau}\right).    (5.87)

Notice that in passing from the second to the third line, I used a trick that is 
often useful in evaluating Fourier coefficients. The integrand on the second line, 
cos(nωt), is an even function; that is, it has the same value at any point t as at
−t. Therefore we could replace any integral from −T to T by twice the integral
from 0 to T.

Finally the b coefficients are all exactly zero, for if you examine the integral
(5.84) you will see that (in this case) the integrand is an odd function; that is, its
value at any point t is the negative of its value at −t. [Moving from t to −t leaves
f(t) unchanged, but reverses the sign of sin(nωt).] Therefore any integral from
−T to T is zero, since the left half exactly cancels the right.
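The closed forms (5.86) and (5.87), and the claim that every b coefficient vanishes, can be checked against a direct numerical evaluation of the integrals (5.83) and (5.84). The following is a minimal Python sketch under stated assumptions: the names pulse, a_closed, and coefficient are illustrative, the numbers τ = 1, f_max = 1, Δτ = 0.25 are those given in the example, and the integrals are approximated with a simple midpoint rule.

```python
import math

TAU, WIDTH, F_MAX = 1.0, 0.25, 1.0   # tau, Delta tau, f_max from the example

def pulse(t):
    """Rectangular pulse of Figure 5.22 on one period, -TAU/2 <= t < TAU/2."""
    return F_MAX if abs(t) <= WIDTH / 2.0 else 0.0

def a_closed(n):
    """Cosine coefficients: (5.86) for n = 0, (5.87) for n >= 1."""
    if n == 0:
        return F_MAX * WIDTH / TAU
    return 2.0 * F_MAX / (math.pi * n) * math.sin(math.pi * n * WIDTH / TAU)

def coefficient(n, trig, steps=20000):
    """Midpoint-rule estimate of (5.83) or (5.84) for n >= 1.

    trig is math.cos for a_n and math.sin for b_n.
    """
    omega = 2.0 * math.pi / TAU
    h = TAU / steps
    total = 0.0
    for k in range(steps):
        t = -TAU / 2.0 + (k + 0.5) * h
        total += pulse(t) * trig(n * omega * t)
    return 2.0 * total * h / TAU

for n in range(1, 7):
    print(n, round(a_closed(n), 4),
          round(coefficient(n, math.cos), 4),
          round(coefficient(n, math.sin), 4))
```

The second and third columns should agree to the printed precision, and the last column should be zero to rounding error, as the odd-integrand argument predicts.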

The required Fourier series is therefore 

    f(t) = a_0 + \sum_{n=1}^{\infty} a_n\cos(n\omega t)    (5.88)

with the constant term a_0 given by (5.86) and all the remaining a coefficients
(n ≥ 1) by (5.87). If we put in the given numbers, these coefficients can all be
evaluated and the resulting Fourier series is

    f(t) = f_{\max}\,[\,0.25 + 0.45\cos(2\pi t) + 0.32\cos(4\pi t) + 0.15\cos(6\pi t)
           - 0.09\cos(10\pi t) - 0.11\cos(12\pi t) + \cdots\,]    (5.89)

(the coefficient of cos(8πt) vanishes, because sin(4π Δτ/τ) = sin π = 0).

Figure 5.23 (a) The sum of the first three terms of the Fourier series for the
rectangular pulse of Figure 5.22. (b) The sum of the first 11 terms.


The practical value of a Fourier series is usually greatest if the series converges
rapidly, so that we get a reliable approximation by retaining just the first
few terms of the series. Figure 5.23(a) shows the sum of the first three terms of
the series (5.89) and the rectangular pulse itself. As you might expect, with just
three smooth terms we do not get a sensationally accurate approximation to the
original discontinuous function. Nevertheless, the three terms do a remarkable
job of imitating the general shape. By the time we have included 11 terms, as in
Figure 5.23(b), the fit is amazing. 17 In the next section, we shall use the method
of Fourier series to solve for the motion of an oscillator driven by the periodic
pulses of this example. We shall find the solution as a Fourier series, which converges
so quickly that just the first 3 or 4 terms tell us most of what is interesting
to know.
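To get a feel for how quickly the partial sums approach the pulse, one can evaluate them at a few sample times instead of plotting. The following is a minimal Python sketch, not the author’s own calculation: the names a_coeff and partial_sum are illustrative, the coefficients come from (5.86) and (5.87) with the numbers of this example, and “the first N terms” is assumed to mean the terms n = 0 through N − 1 of (5.88).

```python
import math

def a_coeff(n, tau=1.0, width=0.25, f_max=1.0):
    """Cosine coefficients of the pulse: (5.86) for n = 0, (5.87) for n >= 1."""
    if n == 0:
        return f_max * width / tau
    return 2.0 * f_max / (math.pi * n) * math.sin(math.pi * n * width / tau)

def partial_sum(t, terms, tau=1.0):
    """Sum of the terms n = 0 .. terms-1 of the series (5.88)."""
    omega = 2.0 * math.pi / tau
    return sum(a_coeff(n, tau) * math.cos(n * omega * t) for n in range(terms))

# t = 0 is the middle of the pulse, t = 0.125 its edge, t = 0.5 midway between pulses.
for t in (0.0, 0.1, 0.125, 0.2, 0.5):
    print(f"t={t:5.3f}   3 terms: {partial_sum(t, 3):6.3f}   "
          f"11 terms: {partial_sum(t, 11):6.3f}")
```

Inside the pulse the sums should be near f_max = 1 and between pulses near 0; right at the edge t = Δτ/2 the series tends to the midpoint value 0.5, however many terms are kept.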


5.8 Fourier Series Solution for the Driven Oscillator* 


* This section contains a beautiful application of the method of Fourier series. Important as it 
is to understand this method, you can nevertheless omit the section without loss of continuity. 

In this section we shall combine our knowledge of Fourier series (Section 5.7) with 
our solution of the sinusoidally driven oscillator (Section 5.5) to solve for the motion 
of an oscillator that is driven by an arbitrary periodic driving force. To see how this 
works, let us return to the equation of motion (5.48) 

    \ddot{x} + 2\beta\dot{x} + \omega_0^2 x = f

where x = x(t) is the position of the oscillator, β is the damping constant, ω_0 is the
natural frequency, and f = f(t) is any periodic driving force (actually force/mass)
with period τ. As before it is convenient to rewrite this in the compact form

    Dx = f

where D stands for the linear differential operator

    D = \frac{d^2}{dt^2} + 2\beta\frac{d}{dt} + \omega_0^2.

17 Notice, however, that the Fourier series still has a little difficulty in the immediate
neighborhood of the discontinuities in f(t). This tendency for the Fourier series to overshoot near a
discontinuity is called the Gibbs phenomenon.

The use of Fourier series to solve this problem hinges on the following observation:
Suppose that the force f(t) is the sum of two forces, f(t) = f_1(t) + f_2(t), for each
of which we have already solved the equation of motion. That is, we already know
functions x_1(t) and x_2(t) that satisfy

    Dx_1 = f_1 \quad \text{and} \quad Dx_2 = f_2.

Then the solution 18 to the problem of interest is just the sum x(t) = x_1(t) + x_2(t), as
we can easily show:

    Dx = D(x_1 + x_2) = Dx_1 + Dx_2 = f_1 + f_2 = f

where the crucial second step is valid because D is linear. This argument would work
equally well however many terms were in the sum for f(t), so we have the conclusion:
If the driving force f(t) can be expressed as the sum of any number of terms

    f(t) = \sum_n f_n(t)

and if we know the solutions x_n(t) for each of the individual forces f_n(t), then the
solution for the total driving force f(t) is just the sum

    x(t) = \sum_n x_n(t).

18 Strictly speaking we should not speak of the solution, since our second-order differential
equation has many solutions. However, we know that the difference between any two solutions is
transient (it decays to zero) and our main interest is in the long-term behavior, which is therefore
essentially unique.

This result is ideally suited for use in combination with Fourier’s theorem. Any
periodic driving force f(t) can be expanded in a Fourier series of sines and cosines,
and we already know the solutions for sinusoidal driving forces. Thus, by adding these
sinusoidal solutions together, we can find the solution for any periodic driving force.
To simplify our writing, let us suppose that the driving force f(t) contains only cosine
terms in its Fourier series. [This was the case for the rectangular pulse of Example 5.4,
and is true for any even function, satisfying f(−t) = f(t), because this condition
guarantees that the coefficients of the sine terms are all zero.] In this case, the driving
force can be written as

    f(t) = \sum_{n=0}^{\infty} f_n\cos(n\omega t),    (5.90)

where f_n denotes the nth Fourier coefficient of f(t), and ω = 2π/τ as usual. Now,
each individual term f_n cos(nωt) has the same form (5.56) that we assumed for the
sinusoidal driving force in Section 5.5 (except that the amplitude f_0 has become f_n and
the frequency ω has become nω). The corresponding solution was given in (5.66), 19

    x_n(t) = A_n\cos(n\omega t - \delta_n)    (5.91)

where

    A_n = \frac{f_n}{\sqrt{(\omega_0^2 - n^2\omega^2)^2 + 4\beta^2 n^2\omega^2}}    (5.92)

from (5.64), and

    \delta_n = \arctan\left(\frac{2\beta n\omega}{\omega_0^2 - n^2\omega^2}\right)    (5.93)

from (5.65). Since (5.91) is the solution for the driving force f_n cos(nωt), the solution
for the complete force (5.90) is the sum

    x(t) = \sum_{n=0}^{\infty} A_n\cos(n\omega t - \delta_n).    (5.94)


This completes the solution for the long-term motion of an oscillator driven by a 
periodic driving force f(t). To summarize, the steps are:

1. Find the coefficients f_n in the Fourier series (5.90) for the given driving force
f(t).

2. Calculate the quantities A_n and δ_n as given by (5.92) and (5.93).

3. Write down the solution x(t) as the Fourier series (5.94).

In practice, one usually needs to include surprisingly few terms of the solution (5.94)
to get a satisfactory approximation, as the following example illustrates. 20
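The three steps above translate almost line by line into code. The following is a minimal Python sketch under stated assumptions: the function names amplitude, phase, and steady_state are illustrative, the drive is assumed to contain cosine terms only (as in (5.90)), the list f_coeffs is assumed to hold the coefficients f_0, f_1, ... already found in step 1, and the series is simply truncated at the length of that list.

```python
import math

def amplitude(f_n, n, omega, omega0, beta):
    """Step 2: A_n of (5.92)."""
    return f_n / math.sqrt((omega0**2 - (n * omega)**2)**2
                           + (2.0 * beta * n * omega)**2)

def phase(n, omega, omega0, beta):
    """Step 2: delta_n of (5.93), on the branch 0 <= delta_n <= pi."""
    return math.atan2(2.0 * beta * n * omega, omega0**2 - (n * omega)**2)

def steady_state(t, f_coeffs, omega, omega0, beta):
    """Step 3: the series (5.94), truncated to len(f_coeffs) terms."""
    return sum(
        amplitude(f_n, n, omega, omega0, beta)
        * math.cos(n * omega * t - phase(n, omega, omega0, beta))
        for n, f_n in enumerate(f_coeffs)
    )

# Usage: a constant force f_0 = 2 alone gives the constant displacement f_0 / omega0^2.
print(steady_state(0.3, [2.0], omega=1.0, omega0=2.0, beta=0.1))
```

With n = 0 the two-argument arctangent returns δ_0 = 0 and the amplitude reduces to f_0/ω_0^2, so the constant term needs no special case in this sketch.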


19 The n = 0, constant term needs separate consideration. It is easy to see that for a constant
force f_0 the solution is x_0 = f_0/ω_0^2. This is actually exactly what you get if you just set n = 0 in
(5.92) and (5.93).

20 The solution contained in Equations (5.92) to (5.94) can be written more compactly if you
don’t mind using complex notation. See Problem 5.51.

EXAMPLE 5.5   An Oscillator Driven by a Rectangular Pulse

Consider a weakly damped oscillator that is being driven by the periodic rectangular
pulses of Example 5.4 (Figure 5.22). Let the natural period of the oscillator
be τ_0 = 1, so that the natural frequency is ω_0 = 2π, and let the damping constant
be β = 0.2. Let the pulse last for a time Δτ = 0.25 and have a height f_max = 1.
Calculate the first six Fourier coefficients A_n for the long-term motion x(t) of the
oscillator, assuming first that the drive period is the same as the natural period,
τ = τ_0 = 1. Plot the resulting motion for several complete oscillations. Repeat
these exercises for τ = 1.5τ_0, 2.0τ_0, and 2.5τ_0.

Before we look at any of these exercises, it is worth thinking of a real system
this problem might represent. One simple possibility is a mass hanging from the
end of a spring, to which a professor is applying regularly spaced upward taps
at intervals τ. An even more familiar example would be a child in a swing, to
whom a parent is giving regularly spaced pushes, though in this case we need
to be careful to keep the amplitude small to justify the use of Hooke’s law. We
are told to start by taking τ = τ_0 = 1; that is, the parent is pushing the child at
exactly the natural frequency.

The Fourier coefficients f_n of the driving force were already calculated in
Equations (5.86) and (5.87) of Example 5.4 (where they were called a_n). If we
substitute these into (5.92) for the coefficients A_n and put in the given numbers
(including τ = τ_0 = 1), we find for the first six Fourier coefficients A_0, ···, A_5:

    A_0    A_1    A_2    A_3    A_4    A_5
     63   1791     27      5      0      1

(Since the numbers are rather small, I have quoted the values multiplied by 10^4;
that is, A_0 = 63 × 10^-4, A_1 = 1791 × 10^-4, and so on.) Two things stand out
about these numbers: First, after A_1, they get rapidly smaller, and for almost all
purposes it would be an excellent approximation to ignore all but the first three
terms in the Fourier series for x(t). Second, the coefficient A_1 is vastly bigger
than all the rest. This is easy to understand if you look at (5.92) for the coefficient
A_1: Since ω = ω_0 (remember the parent is pushing the child at the swing’s
natural frequency) and n = 1, the first term in the denominator is exactly zero,
the denominator is anomalously small, and A_1 is anomalously big compared
to all the other coefficients. In other words, when the driving frequency is the
same as the natural frequency, the n = 1 term in the Fourier series for x(t) is at
resonance, and the oscillator responds especially strongly with frequency ω_0.

Before we can plot x(t) as given by (5.94), we need to calculate the phase
shifts δ_n using (5.93). This is easily done, though I shan’t waste space displaying
the results. We cannot actually plot the infinite series (5.94); instead, we must
pick some finite number of terms with which to approximate x(t). In the present
case, it seems clear that three terms would be plenty, but to be on the safe side
I’ll use six. Figure 5.24 shows x(t) as approximated by the sum of the first
six terms in (5.94). At the scale shown, this approximate graph is completely
indistinguishable from the exact result, which in turn is indistinguishable from
a pure cosine 21 with frequency equal to the natural frequency of the oscillator.
The strong response at the natural frequency is just what we would expect. For
instance, anyone who has pushed a child on a swing knows that the most efficient
way to get the child swinging high is to administer regularly spaced pushes at
intervals of the natural period (that is, τ = τ_0) and that the swing will then
oscillate vigorously at its natural frequency.

21 Actually it’s a pure sine, but this is really cos(ωt − δ_1) with δ_1 = π/2 as we should have
expected because we’re exactly on resonance.

Figure 5.24 The motion of a linear oscillator, driven by periodic
rectangular pulses, with the drive period τ equal to the natural
period τ_0 of the oscillator (and hence ω = ω_0). The horizontal
axis shows time in units of the natural period τ_0. As expected the
motion is almost perfectly sinusoidal, with period equal to the
natural period.

A driving force with any other period τ can be treated in exactly the same
way. The Fourier coefficients A_0, ···, A_5 for all of the values of τ requested
above are shown in Table 5.1.


Table 5.1 The first six Fourier coefficients A_n for the motion x(t)
of a linear oscillator driven by periodic rectangular pulses, for four
different drive periods τ = τ_0, 1.5τ_0, 2.0τ_0, and 2.5τ_0. All values have
been multiplied by 10^4.

                   A_0    A_1    A_2    A_3    A_4    A_5
    τ = 1.0 τ_0     63   1791     27      5      0      1
    τ = 1.5 τ_0     42    145     89     18      6      2
    τ = 2.0 τ_0     32     82    896     40     13      6
    τ = 2.5 τ_0     25     59    130     97     25     11
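The entries of Table 5.1 follow directly from (5.86), (5.87), and (5.92), so they can be regenerated in a few lines. The following is a minimal Python sketch under stated assumptions: the names drive_coeff, response_coeff, and table_row are illustrative, and the numerical values are those given in Example 5.5. Since (5.92) as written gives A_n the sign of f_n, the sketch prints the magnitudes |A_n| × 10^4 for comparison with the table.

```python
import math

TAU0, BETA, WIDTH, F_MAX = 1.0, 0.2, 0.25, 1.0   # values from Example 5.5

def drive_coeff(n, tau):
    """f_n of the rectangular pulse: (5.86) for n = 0, (5.87) for n >= 1."""
    if n == 0:
        return F_MAX * WIDTH / tau
    return 2.0 * F_MAX / (math.pi * n) * math.sin(math.pi * n * WIDTH / tau)

def response_coeff(n, tau):
    """A_n from (5.92) for drive period tau."""
    omega0 = 2.0 * math.pi / TAU0
    omega = 2.0 * math.pi / tau
    denom = math.sqrt((omega0**2 - (n * omega)**2)**2
                      + (2.0 * BETA * n * omega)**2)
    return drive_coeff(n, tau) / denom

def table_row(tau, terms=6):
    """Magnitudes |A_n| times 10^4, rounded to the nearest integer."""
    return [round(abs(response_coeff(n, tau)) * 1e4) for n in range(terms)]

for ratio in (1.0, 1.5, 2.0, 2.5):
    print(f"tau = {ratio:.1f} tau_0:", table_row(ratio * TAU0))
```

Changing BETA or WIDTH in this sketch shows how the resonant entries (A_1 at τ = τ_0, A_2 at τ = 2τ_0) grow or shrink relative to their neighbors.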


The entries in the four rows of this table deserve careful examination. The
first row (τ = τ_0) shows the coefficients already discussed, the most prominent
feature of which is that the n = 1 coefficient is far the largest, because it is
exactly on resonance. In the next row (τ = 1.5τ_0), the n = 1 Fourier component
has moved well away from resonance, and A_1 has dropped by a factor of 12
or so. Some of the other coefficients have increased a bit, but the net effect is
that the oscillator moves much less than when τ = τ_0. This is clearly visible in
Figures 5.25(a) and (b), which show x(t) (as approximated by the first six terms
of its Fourier series) for these two values of the drive period.

Figure 5.25 The motion of a linear oscillator, driven by periodic rectangular pulses, showing four
different values of the drive period τ. (a) When the oscillator is driven at its natural frequency (τ = τ_0),
the n = 1 term of the Fourier series is at resonance and the oscillator responds energetically. (b) When
τ = 1.5τ_0, the response is feeble. (c) When τ = 2.0τ_0, the n = 2 term is at resonance and the response
is strong again. (d) When τ = 2.5τ_0, the response is again weak.

The third row of Table 5.1 shows the Fourier coefficients for a drive period
equal to twice the natural period (the parent is pushing just once every second
swing of the child). Now, with τ = 2.0τ_0, the drive frequency is half the natural
frequency (ω = ω_0/2). This means that the n = 2 Fourier component, with frequency
2ω = ω_0, is exactly on resonance, and the coefficient A_2 is anomalously
large. Once again we get a large response, as seen in Figure 5.25(c).

Let us look a little closer at the case of Figure 5.25(c). It is, of course, a
matter of experience that a perfectly satisfactory way to get a child swinging
is to push once every two swings, although this naturally doesn’t get quite the
result of pushing once for every swing. If you look carefully at Figure 5.25(c),
you will notice that the swings alternate in size: the even swings are slightly
bigger than the odd ones. This too is to be expected: Because the oscillator is
damped, the second swing after each push is bound to be a little smaller than
the first.

Finally, when τ = 2.5τ_0, the n = 2 Fourier component is well past resonance
and A_2 is much smaller again. On the other hand, the n = 3 component is
approaching resonance so that A_3 is getting bigger. In fact, A_2 and A_3 are roughly
the same size, so that x(t) contains two dominant Fourier components and shows
the somewhat kinky behavior seen in Figure 5.25(d). (Similar considerations
apply to the case τ = 1.5τ_0, where the coefficients A_1 and A_2 are both fairly
large.)


5.9 The RMS Displacement; Parseval’s Theorem* 


* The Parseval relation, which we introduce and apply in this section, is one of the most useful 
properties of Fourier series. Nevertheless, you could omit this section if pressed for time. 

In the last section we studied how the response of an oscillator varied with the 
frequency of the applied periodic driving force. We did this by solving for the motion 
x (t), using the method of Fourier series, for each of several interesting applied 
frequencies. It would be convenient if we could find a single number to measure the 
oscillator’s response and then just plot this number against the driving frequency (or 
driving period). In fact there are several ways to do this. Perhaps the most obvious 
thing to try would be the oscillator’s average displacement from equilibrium, ⟨x⟩. (I am
using angle brackets ⟨ ⟩ to indicate a time average.) Unfortunately, since the oscillator
spends as much time in any region of positive x as in the corresponding region of
negative x, the average ⟨x⟩ is zero. 22 To get around this difficulty the most convenient
quantity to use is the mean square displacement ⟨x^2⟩, and to give a quantity with the
dimensions of length we usually discuss the root mean square or RMS displacement

    x_{\mathrm{rms}} = \sqrt{\langle x^2 \rangle}.    (5.95)

The definition of the time average needs a little care. The usual practice is to define
⟨ ⟩ as the average over one period τ. Thus,

    \langle x^2 \rangle = \frac{1}{\tau}\int_{-\tau/2}^{\tau/2} x^2\,dt.    (5.96)

Because the motion is periodic, this is the same as the average over any integer number
of periods, and hence also the average over any long time. (If this isn’t clear to you,
see Problem 5.54.)

22 Not quite true. The small constant term A_0 in (5.94) contributes a nonzero average ⟨x⟩ = A_0,
but this does not reflect any oscillation, which is what we are trying to characterize.

To evaluate the average ⟨x^2⟩, we use the Fourier expansion (5.94) of x(t),

    x(t) = \sum_{n=0}^{\infty} A_n\cos(n\omega t - \delta_n).    (5.97)

(In general, this series will contain sines as well as cosines, but in Example 5.5 the
driving force contained only cosines, and for simplicity let us continue to assume this
is the case. For the general case, see Problem 5.56.) Substituting for each of the factors
x in (5.96), we get the appalling double sum

    \langle x^2 \rangle = \frac{1}{\tau}\int_{-\tau/2}^{\tau/2} \sum_{n}\sum_{m}
        A_n\cos(n\omega t - \delta_n)\,A_m\cos(m\omega t - \delta_m)\,dt.    (5.98)

Fortunately, this simplifies dramatically. It is fairly easy to show (Problem 5.55) that
the integral is just

    \int_{-\tau/2}^{\tau/2} \cos(n\omega t - \delta_n)\cos(m\omega t - \delta_m)\,dt =
        \begin{cases}
            \tau   & \text{if } m = n = 0 \\
            \tau/2 & \text{if } m = n \neq 0 \\
            0      & \text{if } m \neq n.
        \end{cases}    (5.99)

Thus, in the double sum (5.98), only those terms with m = n need to be retained, and
we get the surprisingly simple result that

    \langle x^2 \rangle = A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty} A_n^2.    (5.100)

This relation is called Parseval’s theorem. 23 It has many important theoretical uses,
but for our purposes its main application is this: Since we know how to calculate
the coefficients A_n, Parseval’s theorem lets us find the response ⟨x^2⟩ of our oscillator.
Moreover, by dropping all but some modest finite number of terms in the sum (5.100),
we get an excellent and easily calculated approximation for ⟨x^2⟩, as the following
example illustrates.
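Parseval’s theorem (5.100) can be checked numerically by comparing it with a direct evaluation of the time average (5.96) for the same truncated series. The following is a minimal Python sketch under stated assumptions: the function names are illustrative, the oscillator and pulse parameters are those of Example 5.5, the series (5.94) is cut off after six terms, and the integral is approximated with a midpoint rule.

```python
import math

TAU0, BETA, WIDTH, F_MAX = 1.0, 0.2, 0.25, 1.0   # values from Example 5.5

def response(n, tau):
    """Return (A_n, delta_n) from (5.92) and (5.93) for the rectangular pulse."""
    omega0, omega = 2.0 * math.pi / TAU0, 2.0 * math.pi / tau
    if n == 0:
        f_n = F_MAX * WIDTH / tau
    else:
        f_n = 2.0 * F_MAX / (math.pi * n) * math.sin(math.pi * n * WIDTH / tau)
    real, imag = omega0**2 - (n * omega)**2, 2.0 * BETA * n * omega
    return f_n / math.hypot(real, imag), math.atan2(imag, real)

def x_truncated(t, tau, terms=6):
    """The series (5.94) cut off after `terms` terms."""
    omega = 2.0 * math.pi / tau
    total = 0.0
    for n in range(terms):
        a_n, delta_n = response(n, tau)
        total += a_n * math.cos(n * omega * t - delta_n)
    return total

def mean_square_direct(tau, terms=6, steps=2000):
    """Midpoint-rule estimate of the time average (5.96)."""
    h = tau / steps
    samples = (x_truncated(-tau / 2.0 + (k + 0.5) * h, tau, terms)
               for k in range(steps))
    return sum(x * x for x in samples) * h / tau

def mean_square_parseval(tau, terms=6):
    """Right-hand side of (5.100), truncated after `terms` terms."""
    amps = [response(n, tau)[0] for n in range(terms)]
    return amps[0]**2 + 0.5 * sum(a * a for a in amps[1:])

for tau in (1.0, 1.5, 2.5):
    print(tau, mean_square_direct(tau), mean_square_parseval(tau))
```

The two columns should agree to rounding error, while the Parseval side needs only six amplitudes rather than thousands of function evaluations.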


23 Remember that we made the simplifying assumption that our Fourier series contained only
cosine terms. In general, the sum in (5.100) must include contributions B_n^2 from the sine terms as
well. See Problem 5.56.

EXAMPLE 5.6   The RMS Displacement for a Driven Oscillator

Consider again the oscillator of Example 5.5, driven by the periodic rectangular
pulses of Example 5.4 (Figure 5.22). Find the RMS displacement x_rms = √⟨x^2⟩
as given by (5.100) for this oscillator. Using the same numerical values as
before (τ_0 = 1, β = 0.2, f_max = 1, Δτ = 0.25) and approximating (5.100) by
its first six terms, make a plot of x_rms as a function of the drive period τ for
0.25 < τ < 5.5.

We have already done all the calculations needed to write down a formula
for x_rms = √⟨x^2⟩. First, ⟨x^2⟩ is given by (5.100), where the Fourier coefficients
A_n are given by (5.92) as

    A_n = \frac{f_n}{\sqrt{(\omega_0^2 - n^2\omega^2)^2 + 4\beta^2 n^2\omega^2}}    (5.101)

and the Fourier coefficients f_n of the driving force are given by (5.87) and (5.86)
as

    f_n = \frac{2 f_{\max}}{\pi n}\sin\left(\frac{\pi n\,\Delta\tau}{\tau}\right) \qquad [\text{for } n \geq 1]    (5.102)

while f_0 = f_max Δτ/τ. Putting all these together gives the desired formula for
x_rms (which I’ll leave you to write down if you want to see it).

If we now put in the given numbers, we are left with just one independent
variable, the period of the driving force τ. (Remember that ω = 2π/τ.) Truncating
the infinite series (5.100) after six terms, we arrive at an expression that is
easily evaluated with the appropriate software (or even a programmable calculator)
and plotted as shown in Figure 5.26. This graph shows clearly and succinctly
what we found in the previous example. As we increase the drive period τ, the
response of the oscillator varies dramatically. Each time τ passes through an
integer multiple of the natural period τ_0 (that is, τ = nτ_0), the response exhibits
a sharp maximum, because the nth Fourier component is at resonance. On the
other hand, each successive peak is lower than its predecessor, since we elected
to fix the width Δτ and height f_max of the pulses; thus as the drive period gets
longer, the net effect of the force would be expected to get less.

Figure 5.26 The RMS displacement of a linear oscillator, driven by
periodic rectangular pulses, as a function of the drive period τ,
calculated using the first six terms of the Parseval expression (5.100).
The horizontal axis shows τ in units of the natural period τ_0. When τ
is an integral multiple of τ_0 the response is especially strong.
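The calculation behind Figure 5.26 is short enough to sketch in code. The following is a minimal Python sketch under stated assumptions: the function name x_rms, the step of 0.25 in τ, and the crude text bar chart are illustrative choices rather than the author’s own program, and the sum (5.100) is truncated after six terms as in the example.

```python
import math

TAU0, BETA, WIDTH, F_MAX = 1.0, 0.2, 0.25, 1.0   # values from Example 5.6

def x_rms(tau, terms=6):
    """RMS displacement from (5.100), truncated after `terms` terms."""
    omega0, omega = 2.0 * math.pi / TAU0, 2.0 * math.pi / tau
    mean_square = 0.0
    for n in range(terms):
        if n == 0:
            f_n = F_MAX * WIDTH / tau                                   # (5.86)
        else:
            f_n = (2.0 * F_MAX / (math.pi * n)
                   * math.sin(math.pi * n * WIDTH / tau))               # (5.102)
        a_n = f_n / math.hypot(omega0**2 - (n * omega)**2,
                               2.0 * BETA * n * omega)                  # (5.101)
        mean_square += a_n**2 if n == 0 else 0.5 * a_n**2
    return math.sqrt(mean_square)

# Scan tau from 0.25 to 5.5 in steps of 0.25 and draw one bar per value.
for k in range(1, 23):
    tau = 0.25 * k
    value = x_rms(tau)
    print(f"{tau:5.2f}  {value:.4f}  " + "#" * round(400 * value))
```

The bars should stand out at τ = 1, 2, and 3 and shrink from one peak to the next; with only six terms kept, resonances of components beyond n = 5 are not represented in this sketch.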


Principal Definitions and Equations of Chapter 5 

Hooke’s Law 


    F = -kx \iff U = \frac{1}{2}kx^2    [Section 5.1]

Simple Harmonic Motion

    \ddot{x} = -\omega^2 x \iff x(t) = A\cos(\omega t - \delta), etc.    [Section 5.2]


Damped Oscillations

If the oscillator is subject to a damping force −bv, then

    \ddot{x} + 2\beta\dot{x} + \omega_0^2 x = 0 \iff x(t) = Ae^{-\beta t}\cos(\omega_1 t - \delta)    [Eqs. (5.28) & (5.38)]

where β = b/2m, ω_0 = √(k/m), ω_1 = √(ω_0^2 − β^2), and the solution given here is for
“weak damping” (β < ω_0).


Driven Damped Oscillations and Resonance

If the oscillator is also subject to a sinusoidal driving force F(t) = m f_0 cos(ωt), the
long-term motion has the form

    x(t) = A\cos(\omega t - \delta)    [Eq. (5.66)]

where

    A^2 = \frac{f_0^2}{(\omega_0^2 - \omega^2)^2 + 4\beta^2\omega^2}    [Eq. (5.64)]

and the phase shift δ is given by (5.65). To this solution can be added a “transient”
solution of the corresponding homogeneous equation, but this dies out as time passes.
The long-term solution “resonates” (has a sharp maximum) when ω is close to ω_0.


Fourier Series

If the driving force is not sinusoidal, but is still periodic, it can be built up as a Fourier
series of sinusoidal terms, as in (5.90), and the resulting motion is the corresponding
series of sinusoidal solutions, as in (5.94):

    x(t) = \sum_{n=0}^{\infty} A_n\cos(n\omega t - \delta_n).    [Eq. (5.94)]



The RMS Displacement

The root mean square displacement

    x_{\mathrm{rms}} = \sqrt{\frac{1}{\tau}\int_{-\tau/2}^{\tau/2} x^2\,dt}    [Eqs. (5.95) & (5.96)]

is a good measure of the average response of the oscillator and is given by Parseval’s
theorem as

    x_{\mathrm{rms}} = \sqrt{A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty} A_n^2}.    [Eq. (5.100)]


Problems for Chapter 5

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult ( 

section 5.1 Hooke’s Law 

5.1 * A massless spring has unstretched length l_0 and force constant k. One end is now attached to the
ceiling and a mass m is hung from the other. The equilibrium length of the spring is now l_1. (a) Write
down the condition that determines l_1. Suppose now the spring is stretched a further distance x beyond
its new equilibrium length. Show that the net force (spring plus gravity) on the mass is F = −kx. That
is, the net force obeys Hooke’s law, when x is the distance from the equilibrium position. This is a very useful
result, which lets us treat a mass on a vertical spring just as if it were horizontal. (b) Prove the same
result by showing that the net potential energy (spring plus gravity) has the form U(x) = const + (1/2)kx^2.

5.2 * The potential energy of two atoms in a molecule can sometimes be approximated by the Morse
function,

$U(r) = A\left[\left(e^{(R-r)/S} - 1\right)^2 - 1\right]$

where r is the distance between the two atoms and A, R, and S are positive constants with $S \ll R$. Sketch
this function for $0 < r < \infty$. Find the equilibrium separation $r_0$, at which U(r) is minimum. Now write
$r = r_0 + x$ so that x is the displacement from equilibrium, and show that, for small displacements, U
has the approximate form $U = \text{const} + \tfrac{1}{2}kx^2$. That is, Hooke’s law applies. What is the force constant k?

5.3 * Write down the potential energy $U(\phi)$ of a simple pendulum (mass m, length l) in terms of the
angle $\phi$ between the pendulum and the vertical. (Choose the zero of U at the bottom.) Show that, for
small angles, U has the Hooke’s law form $U(\phi) = \tfrac{1}{2}k\phi^2$, in terms of the coordinate $\phi$. What is k?

5.4 ** An unusual pendulum is made by fixing a string to a horizontal cylinder of radius R, wrapping
the string several times around the cylinder, and then tying a mass m to the loose end. In equilibrium
the mass hangs a distance $l_0$ vertically below the edge of the cylinder. Find the potential energy if the
pendulum has swung to an angle $\phi$ from the vertical. Show that for small angles, it can be written in
the Hooke’s law form $U = \tfrac{1}{2}k\phi^2$. Comment on the value of k.


section 5.2 Simple Harmonic Motion 

5.5 * In Section 5.2 we discussed four equivalent ways to represent simple harmonic motion in one
dimension:

$x(t) = C_1 e^{i\omega t} + C_2 e^{-i\omega t}$ (I)

$= B_1\cos(\omega t) + B_2\sin(\omega t)$ (II)

$= A\cos(\omega t - \delta)$ (III)

$= \operatorname{Re} C e^{i\omega t}$ (IV)

To make sure you understand all of these, show that they are equivalent by proving the following
implications: I $\Rightarrow$ II $\Rightarrow$ III $\Rightarrow$ IV $\Rightarrow$ I. For each form, give an expression for the constants ($C_1$, $C_2$, etc.)
in terms of the constants of the previous form.

5.6 * A mass on the end of a spring is oscillating with angular frequency $\omega$. At t = 0, its position is
$x_0 > 0$ and I give it a kick so that it moves back toward the origin and executes simple harmonic motion
with amplitude $2x_0$. Find its position as a function of time in the form (III) of Problem 5.5.

5.7 * (a) Solve for the coefficients $B_1$ and $B_2$ of the form (II) of Problem 5.5 in terms of the initial
position $x_0$ and velocity $v_0$ at t = 0. (b) If the oscillator’s mass is m = 0.5 kg and the force constant
is k = 50 N/m, what is the angular frequency $\omega$? If $x_0 = 3.0$ m and $v_0 = 50$ m/s, what are $B_1$ and $B_2$?
Sketch x(t) for a couple of cycles. (c) What are the earliest times at which $x = 0$ and at which $\dot{x} = 0$?

5.8 * (a) If a mass m = 0.2 kg is tied to one end of a spring whose force constant k = 80 N/m and
whose other end is held fixed, what are the angular frequency $\omega$, the frequency f, and the period $\tau$ of its
oscillations? (b) If the initial position and velocity are $x_0 = 0$ and $v_0 = 40$ m/s, what are the constants
A and $\delta$ in the expression $x(t) = A\cos(\omega t - \delta)$?

5.9 * The maximum displacement of a mass oscillating about its equilibrium position is 0.2 m, and its
maximum speed is 1.2 m/s. What is the period $\tau$ of its oscillations?

5.10 * The force on a mass m at position x on the x axis is $F = -F_0\sinh(\alpha x)$, where $F_0$ and $\alpha$ are
constants. Find the potential energy U(x), and give an approximation for U(x) suitable for small
oscillations. What is the angular frequency of such oscillations?

5.11 * You are told that, at the known positions $x_1$ and $x_2$, an oscillating mass m has speeds $v_1$ and $v_2$.
What are the amplitude and the angular frequency of the oscillations?

5.12 ** Consider a simple harmonic oscillator with period $\tau$. Let $\langle f\rangle$ denote the average value of any
variable f(t), averaged over one complete cycle:

$\langle f\rangle = \dfrac{1}{\tau}\int_0^{\tau} f(t)\,dt$. (5.103)

Prove that $\langle T\rangle = \langle U\rangle = \tfrac{1}{2}E$ where E is the total energy of the oscillator. [Hint: Start by proving the
more general, and extremely useful, results that $\langle\sin^2(\omega t - \delta)\rangle = \langle\cos^2(\omega t - \delta)\rangle = \tfrac{1}{2}$. Explain why
these two results are almost obvious, then prove them by using trig identities to rewrite $\sin^2\theta$ and
$\cos^2\theta$ in terms of $\cos(2\theta)$.]

5.13 ** The potential energy of a one-dimensional mass m at a distance r from the origin is

$U(r) = U_0\left(\dfrac{r}{R} + \lambda^2\dfrac{R}{r}\right)$

for $0 < r < \infty$, with $U_0$, R, and $\lambda$ all positive constants. Find the equilibrium position $r_0$. Let x be the
distance from equilibrium and show that, for small x, the PE has the form $U = \text{const} + \tfrac{1}{2}kx^2$. What is
the angular frequency of small oscillations?

Figure 5.27 Problem 5.18

section 5.3 Two-Dimensional Oscillators 

5.14 * Consider a particle in two dimensions, subject to a restoring force of the form (5.21). (The two
constants $k_x$ and $k_y$ may or may not be equal; if they are, the oscillator is isotropic.) Prove that its
potential energy is

$U = \tfrac{1}{2}(k_x x^2 + k_y y^2)$. (5.104)

5.15 * The general solution for a two-dimensional isotropic oscillator is given by (5.19). Show that
by changing the origin of time you can cast this in the simpler form (5.20) with $\delta = \delta_y - \delta_x$. [Hint: A
change of origin of time is a change of variables from t to $t' = t + t_0$. Make this change and choose
the constant $t_0$ appropriately, then rename $t'$ to be t.]

5.16 * Consider a two-dimensional isotropic oscillator moving according to Equation (5.20). Show
that if the relative phase is $\delta = \pi/2$, the particle moves in an ellipse with semimajor and semiminor
axes $A_x$ and $A_y$.

5.17 ** Consider the two-dimensional anisotropic oscillator with motion given by Equation (5.23).
(a) Prove that if the ratio of frequencies is rational (that is, $\omega_x/\omega_y = p/q$ where p and q are integers)
then the motion is periodic. What is the period? (b) Prove that if the same ratio is irrational, the motion
never repeats itself.

5.18 *** The mass shown from above in Figure 5.27 is resting on a frictionless horizontal table. Each
of the two identical springs has force constant k and unstretched length $l_0$. At equilibrium the mass rests
at the origin, and the distances a are not necessarily equal to $l_0$. (That is, the springs may already be
stretched or compressed.) Show that when the mass moves to a position (x, y), with x and y small, the
potential energy has the form (5.104) (Problem 5.14) for an anisotropic oscillator. Show that if $a < l_0$
the equilibrium at the origin is unstable and explain why.

5.19 *** Consider the mass attached to four identical springs, as shown in Figure 5.7(b). Each spring
has force constant k and unstretched length $l_0$, and the length of each spring when the mass is at its
equilibrium at the origin is a (not necessarily the same as $l_0$). When the mass is displaced a small
distance to the point (x, y), show that its potential energy has the form $\tfrac{1}{2}k'r^2$ appropriate to an isotropic
harmonic oscillator. What is the constant $k'$ in terms of k? Give an expression for the corresponding
force.

section 5.4 Damped Oscillations 

5.20 * Verify that the decay parameter $\beta - \sqrt{\beta^2 - \omega_0^2}$ for an overdamped oscillator ($\beta > \omega_0$) decreases
with increasing $\beta$. Sketch its behavior for $\omega_0 < \beta < \infty$.

5.21 * Verify that the function (5.43), $x(t) = te^{-\beta t}$, is indeed a second solution of the equation of
motion (5.28) for a critically damped oscillator ($\beta = \omega_0$).

5.22 * (a) Consider a cart on a spring which is critically damped. At time t = 0, it is sitting at its
equilibrium position and is kicked in the positive direction with velocity $v_0$. Find its position x(t) for
all subsequent times and sketch your answer. (b) Do the same for the case that it is released from rest at
position $x = x_0$. In this latter case, how far is the cart from equilibrium after a time equal to $\tau_0 = 2\pi/\omega_0$,
the period in the absence of any damping?

5.23 * A damped oscillator satisfies the equation (5.24), where $F_{\text{dmp}} = -b\dot{x}$ is the damping force. Find
the rate of change of the energy $E = \tfrac{1}{2}m\dot{x}^2 + \tfrac{1}{2}kx^2$ (by straightforward differentiation), and, with the
help of (5.24), show that dE/dt is (minus) the rate at which energy is dissipated by $F_{\text{dmp}}$.

5.24 * In our discussion of critical damping ($\beta = \omega_0$), the second solution (5.43) was rather pulled
out of a hat. One can arrive at it in a reasonably systematic way by looking at the solutions for
$\beta < \omega_0$ and carefully letting $\beta \to \omega_0$, as follows: For $\beta < \omega_0$, we can write the two solutions as
$x_1(t) = e^{-\beta t}\cos(\omega_1 t)$ and $x_2(t) = e^{-\beta t}\sin(\omega_1 t)$. Show that as $\beta \to \omega_0$, the first of these approaches
the first solution for critical damping, $x_1(t) = e^{-\beta t}$. Unfortunately, as $\beta \to \omega_0$, the second of them
goes to zero. (Check this.) However, as long as $\beta \neq \omega_0$, you can divide $x_2(t)$ by $\omega_1$ and you will still
have a perfectly good second solution. Show that as $\beta \to \omega_0$, this new second solution approaches the
advertised $te^{-\beta t}$.

5.25 ** Consider a damped oscillator with $\beta < \omega_0$. There is a little difficulty defining the “period” $\tau_1$
since the motion (5.38) is not periodic. However, a definition that makes sense is that $\tau_1$ is the time
between successive maxima of x(t). (a) Make a sketch of x(t) against t and indicate this definition of
$\tau_1$ on your graph. Show that $\tau_1 = 2\pi/\omega_1$. (b) Show that an equivalent definition is that $\tau_1$ is twice the
time between successive zeros of x(t). Show this one on your sketch. (c) If $\beta = \omega_0/2$, by what factor
does the amplitude shrink in one period?

5.26 ** An undamped oscillator has period $\tau_0 = 1.000$ s, but I now add a little damping so that its
period changes to $\tau_1 = 1.001$ s. What is the damping factor $\beta$? By what factor will the amplitude of
oscillation decrease after 10 cycles? Which effect of damping would be more noticeable, the change
of period or the decrease of the amplitude?

5.27 ** As the damping on an oscillator is increased there comes a point when the name “oscillator”
seems barely appropriate. (a) To illustrate this, prove that a critically damped oscillator can never pass
through the origin x = 0 more than once. (b) Prove the same for an overdamped oscillator.

5.28 ** A massless spring is hanging vertically and unloaded, from the ceiling. A mass is attached to
the bottom end and released. How close to its final resting position is the mass after 1 second, given that
it finally comes to rest 0.5 meters below the point of release and that the motion is critically damped?

5.29 ** An undamped oscillator has period $\tau_0 = 1$ second. When weak damping is added, it is found
that the amplitude of oscillation drops by 50% in one period $\tau_1$. (The period of the damped oscillations
is defined as the time between successive maxima, $\tau_1 = 2\pi/\omega_1$. See Problem 5.25.) How big is $\beta$
compared to $\omega_0$? What is $\tau_1$?

5.30 ** The position x(t) of an overdamped oscillator is given by (5.40). (a) Find the constants $C_1$
and $C_2$ in terms of the initial position $x_0$ and velocity $v_0$. (b) Sketch the behavior of x(t) for the two
cases that $v_0 = 0$ and that $x_0 = 0$. (c) To illustrate again how mathematics is sometimes cleverer than
we (and check your answer), show that if you let $\beta \to 0$, your solution for x(t) in part (a) approaches
the correct solution for undamped motion.

5.31 ** [Computer] Consider a cart on a spring with natural frequency $\omega_0 = 2\pi$, which is released
from rest at $x_0 = 1$ and t = 0. Using appropriate graphing software, plot the position x(t) for 0 < t < 2
and for damping constants $\beta$ = 0, 1, 2, 4, 6, $2\pi$, 10, and 20. [Remember that x(t) is given by different
formulas for $\beta < \omega_0$, $\beta = \omega_0$, and $\beta > \omega_0$.]

5.32 ** [Computer] Consider an underdamped oscillator (such as a mass on the end of a spring) that
is released from rest at position $x_0$ at time t = 0. (a) Find the position x(t) at later times in the form

$x(t) = e^{-\beta t}\left[B_1\cos(\omega_1 t) + B_2\sin(\omega_1 t)\right]$.

That is, find $B_1$ and $B_2$ in terms of $x_0$. (b) Now show that if you let $\beta$ approach the critical value $\omega_0$,
your solution automatically yields the critical solution. (c) Using appropriate graphing software, plot
the solution for 0 < t < 20, with $x_0 = 1$, $\omega_0 = 1$, and $\beta$ = 0, 0.02, 0.1, 0.3, and 1.

section 5.5 Driven Damped Oscillations 

5.33 * The solution for x(t) for a driven, underdamped oscillator is most conveniently found in the
form (5.69). Solve that equation and the corresponding expression for $\dot{x}$, to give the coefficients $B_1$
and $B_2$ in terms of A, $\delta$, and the initial position and velocity $x_0$ and $v_0$. Verify the expressions given in
(5.70).

5.34 * Suppose that you have found a particular solution $x_p(t)$ of the inhomogeneous equation (5.48)
for a driven damped oscillator, so that $Dx_p = f$ in the operator notation of (5.49). Suppose also
that x(t) is any other solution, so that $Dx = f$. Prove that the difference $x - x_p$ must satisfy the
corresponding homogeneous equation, $D(x - x_p) = 0$. This is an alternative proof that any solution x of
the inhomogeneous equation can be written as the sum of your particular solution plus a homogeneous
solution; that is, $x = x_p + x_h$.

5.35 ** This problem is to refresh your memory about some properties of complex numbers needed
at several points in this chapter, but especially in deriving the resonance formula (5.64). (a) Prove that
any complex number $z = x + iy$ (with x and y real) can be written as $z = re^{i\theta}$ where r and $\theta$ are the
polar coordinates of z in the complex plane. (Remember Euler’s formula.) (b) Prove that the absolute
value of z, defined as $|z| = r$, is also given by $|z|^2 = zz^*$, where $z^*$ denotes the complex conjugate of z,
defined as $z^* = x - iy$. (c) Prove that $z^* = re^{-i\theta}$. (d) Prove that $(zw)^* = z^*w^*$ and that $(1/z)^* = 1/z^*$.
(e) Deduce that if $z = a/(b + ic)$, with a, b, and c real, then $|z|^2 = a^2/(b^2 + c^2)$.

5.36 ** [Computer] Repeat the calculations of Example 5.3 (page 185) with all the same parameters,
but with the initial conditions $x_0 = 2$ and $v_0 = 0$. Plot x(t) for 0 < t < 4 and compare with the plot of
Example 5.3. Explain the similarities and differences.

5.37 ** [Computer] Repeat the calculations of Example 5.3 (page 185) but with the following parameters

$\omega = 2\pi$, $\omega_0 = 0.25\,\omega$, $\beta = 0.2\,\omega_0$, $f_0 = 1000$

and with the initial conditions $x_0 = 0$ and $v_0 = 0$. Plot x(t) for 0 < t < 12 and compare with the plot
of Example 5.3. Explain the similarities and differences. (It will help your explanation if you plot the
homogeneous solution as well as the complete solution, homogeneous plus particular.)

5.38 ** [Computer] Repeat the calculations of Example 5.3 (page 185) but take the parameters of the
system to be $\omega = \omega_0 = 1$, $\beta = 0.1$, and $f_0 = 0.4$, with the initial conditions $x_0 = 0$ and $v_0 = 6$ (all in
some appropriate units). Find A and $\delta$, and then $B_1$ and $B_2$, and make a plot of x(t) for the first ten or
so periods.

5.39 ** [Computer] To get some practice at solving differential equations numerically, repeat the
calculations of Example 5.3 (page 185), but instead of finding all the various coefficients just use
appropriate software (for example, the NDSolve command of Mathematica) to solve the differential
equation (5.48) with the boundary conditions $x_0 = v_0 = 0$. Make sure your graph agrees with Figure


section 5.6 Resonance 

5.40 * Consider a damped oscillator, with fixed natural frequency $\omega_0$ and fixed damping constant $\beta$
(not too large), that is driven by a sinusoidal force with variable frequency $\omega$. Show that the amplitude
of the response, as given by (5.71), is maximum when $\omega = \sqrt{\omega_0^2 - 2\beta^2}$. (Note that so long as the
resonance is narrow this implies $\omega \approx \omega_0$.)

5.41 * We know that if the driving frequency $\omega$ is varied, the maximum response ($A^2$) of a driven
damped oscillator occurs at $\omega \approx \omega_0$ (if the natural frequency is $\omega_0$ and the damping constant $\beta \ll
\omega_0$). Show that $A^2$ is equal to half its maximum value when $\omega \approx \omega_0 \pm \beta$, so that the full width at
half maximum is just $2\beta$. [Hint: Be careful with your approximations. For instance, it’s fine to say
$\omega + \omega_0 \approx 2\omega_0$, but you certainly mustn’t say $\omega - \omega_0 \approx 0$.]

5.42 * A large Foucault pendulum such as hangs in many science museums can swing for many hours 
before it damps out. Taking the decay time to be about 8 hours and the length to be 30 meters, find the 
quality factor Q. 

5.43 ** When a car drives along a “washboard” road, the regular bumps cause the wheels to oscillate 
on the springs. (What actually oscillates is each axle assembly, comprising the axle and its two wheels.) 
Find the speed of my car at which this oscillation resonates, given the following information: (a) When 
four 80-kg men climb into my car, the body sinks by a couple of centimeters. Use this to estimate the 
spring constant k of each of the four springs, (b) If an axle assembly (axle plus two wheels) has total 
mass 50 kg, what is the natural frequency of the assembly oscillating on its two springs? (c) If the 
bumps on a road are 80 cm apart, at about what speed would these oscillations go into resonance? 

5.44 ** Another interpretation of the Q of a resonance comes from the following: Consider the motion
of a driven damped oscillator after any transients have died out, and suppose that it is being driven close
to resonance, so you can set $\omega = \omega_0$. (a) Show that the oscillator’s total energy (kinetic plus potential)
is $E = \tfrac{1}{2}m\omega^2A^2$. (b) Show that the energy $\Delta E_{\text{dis}}$ dissipated during one cycle by the damping force
$F_{\text{dmp}}$ is $2\pi m\beta\omega A^2$. (Remember that the rate at which a force does work is Fv.) (c) Hence show that
Q is $2\pi$ times the ratio $E/\Delta E_{\text{dis}}$.

5.45 *** Consider a damped oscillator, with natural frequency $\omega_0$ and damping constant $\beta$ both fixed,
that is driven by a force $F(t) = F_0\cos(\omega t)$. (a) Find the rate P(t) at which F(t) does work and show
that the average rate $\langle P\rangle$ over any number of complete cycles is $m\beta\omega^2A^2$. (b) Verify that this is the
same as the average rate at which energy is lost to the resistive force. (c) Show that as $\omega$ is varied $\langle P\rangle$
is maximum when $\omega = \omega_0$; that is, the resonance of the power occurs at $\omega = \omega_0$ (exactly).

section 5.7 Fourier Series * 

5.46 * The constant term $a_0$ in a Fourier series is a bit of a nuisance, always requiring slightly special
treatment. At least it has a rather simple interpretation: Show that if f(t) has the standard Fourier series
(5.82), then $a_0$ is equal to the average $\langle f\rangle$ of f(t) taken over one complete cycle.

5.47 ** In order to prove the crucial formulas (5.83)-(5.85) for the Fourier coefficients $a_n$ and $b_n$, you
must first prove the following:

$\int_{-\tau/2}^{\tau/2} \cos(n\omega t)\cos(m\omega t)\,dt = \begin{cases} \tau/2 & \text{if } m = n \neq 0 \\ 0 & \text{if } m \neq n. \end{cases}$ (5.105)

(This integral is obviously $\tau$ if m = n = 0.) There is an identical result with all cosines replaced by
sines, and finally

$\int_{-\tau/2}^{\tau/2} \cos(n\omega t)\sin(m\omega t)\,dt = 0$ for all integers n and m, (5.106)

where as usual $\omega = 2\pi/\tau$. Prove these. [Hint: Use trig identities to replace $\cos(\theta)\cos(\phi)$ by terms like
$\cos(\theta + \phi)$ and so on.]

5.48 ** Use the results (5.105) and (5.106) to prove the formulas (5.83)-(5.85) for the Fourier coefficients
$a_n$ and $b_n$. [Hint: Multiply both sides of the Fourier expansion (5.82) by $\cos(m\omega t)$ or $\sin(m\omega t)$
and then integrate from $-\tau/2$ to $\tau/2$.]

5.49 [Computer] Find the Fourier coefficients $a_n$ and $b_n$ for the function shown in Figure 5.28(a).
Make a plot similar to Figure 5.23, comparing the function itself with the first couple of terms in the
Fourier series, and another for the first six or so terms. Take $f_{\max} = 1$.

Figure 5.28 (a) Problem 5.49. (b) Problem 5.50

5.50 *** [Computer] Find the Fourier coefficients $a_n$ and $b_n$ for the function shown in Figure 5.28(b).
Make a plot similar to Figure 5.23, comparing the function itself with the sum of the first couple of
terms in the Fourier series, and another for the first 10 or so terms. Take $f_{\max} = 1$.

section 5.8 Fourier Series Solution for the Driven Oscillator* 

5.51 ** You can make the Fourier series solution for a periodically driven oscillator a bit tidier if you
don’t mind using complex numbers. Obviously the periodic force of Equation (5.90) can be written as
f = Re(g), where the complex function g is

$g(t) = \sum_{n=0}^{\infty} f_n e^{in\omega t}$.

Show that the real solution for the oscillator’s motion can likewise be written as x = Re(z), where

$z(t) = \sum_{n=0}^{\infty} C_n e^{in\omega t}$

and

$C_n = \dfrac{f_n}{\omega_0^2 - n^2\omega^2 + 2i\beta n\omega}$.

This solution avoids our having to worry about the real amplitude $A_n$ and phase shift $\delta_n$ separately. (Of
course $A_n$ and $\delta_n$ are hidden inside the complex number $C_n$.)

5.52 *** [Computer] Repeat all the calculations and plots of Example 5.5 (page 199) with all the same
parameters except that $\beta = 0.1$. Compare your results with those of the example.

5.53 *** [Computer] An oscillator is driven by the periodic force of Problem 5.49 [Figure 5.28(a)],
which has period $\tau = 2$. (a) Find the long-term motion x(t), assuming the following parameters: natural
period $\tau_0 = 2$ (that is, $\omega_0 = \pi$), damping parameter $\beta = 0.1$, and maximum drive strength $f_{\max} = 1$.
Find the coefficients in the Fourier series for x(t) and plot the sum of the first four terms in the series
for 0 < t < 6. (b) Repeat, except with natural period equal to 3.

section 5.9 The RMS Displacement; Parseval’s Theorem * 

5.54 * Let f(t) be a periodic function with period $\tau$. Explain clearly why the average of f over one
period is not necessarily the same as the average over some other time interval. Explain why, on the
other hand, the average over a long time T approaches the average over one period, as $T \to \infty$.

5.55 ** To prove the Parseval relation (5.100), one must first prove the result (5.99) for the integral of
a product of cosines. Prove this result, and then use it to prove the Parseval relation.

5.56 ** The Parseval relation as stated in (5.100) applies to a function whose Fourier series happens
to contain only cosines. Write down the relation and prove it for a function

$x(t) = \sum_{n=0}^{\infty}\left[A_n\cos(n\omega t - \delta_n) + B_n\sin(n\omega t - \delta_n)\right]$.

5.57 ** [Computer] Repeat the calculations that led to Figure 5.26, using all the same parameters
except taking $\beta = 0.1$. Plot your results and compare your plot with Figure 5.26.

Calculus of Variations


In many problems one needs to use non-Cartesian coordinates. Roughly speaking there 
are two classes of such problems. First, certain symmetries make it most advantageous 
to use special coordinates: Problems with spherical symmetry call out for the use of 
spherical polar coordinates; similarly, problems with axial symmetry are best treated 
in cylindrical polar coordinates. Second, when particles are constrained in some way, 
it is usually best to choose an appropriate, and usually non-Cartesian, coordinate 
system. For example, an object that is constrained to move on the surface of a sphere 
is probably best treated using spherical polar coordinates; if a bead slides on a curved 
wire, the best choice of coordinate may be just the distance along the curving wire 
from some convenient origin. 

Unfortunately, as we have seen, the expressions for the components of the acceleration
in non-Cartesian coordinates are quite messy, and the situation gets rapidly
worse as we move on to more complicated systems. This makes Newton’s second
law difficult to use in non-Cartesian coordinates. We need an alternative (though ultimately
equivalent) equation of motion that works equally well in any coordinates,
and the required alternative is provided by Lagrange’s equations.

The best way to prove Lagrange’s equations, and to understand their great flexibility,
is to use a “variational principle.” Variational principles are important in
many areas of mathematics and physics. It has proved possible to formulate almost
every branch of physics (classical mechanics, quantum mechanics, optics, electromagnetism,
and so on) in variational terms. To the beginning student, accustomed
to Newton’s laws, a reformulation of classical mechanics in terms of a variational 
principle does not necessarily seem like an improvement. But because they allow a 
similar formulation of so many different subjects, variational methods have given a 
unity to physics and have played a crucial role in the recent history of physical theory. 
For this reason, I would like to introduce variational methods in a reasonably general 
setting. Therefore this short chapter is a brief introduction to variational problems in 
general. In the next chapter I shall apply what we learn here to establish the Lagrangian 
formulation of mechanics. If you are already familiar with the “calculus of variations” 
you could skip straight to Chapter 7. 



6.1 Two Examples 


The calculus of variations involves finding the minimum or maximum of a quantity 
that is expressible as an integral. To see how this can arise, I would like to start with 
two simple, concrete examples. 


The Shortest Path between Two Points 

My first example is this problem: Given two points in a plane, what is the shortest 
path between them? While you certainly know the answer — a straight line — you 
probably have not seen a proof, unless you have studied the calculus of variations. 
The problem is illustrated in Figure 6.1, which shows the two given points, $(x_1, y_1)$
and $(x_2, y_2)$, and a path, y = y(x), joining them. Our task is to find the path y(x) that
has the shortest length and to show that it is in fact a straight line.

The length of a short segment of the path is $ds = \sqrt{dx^2 + dy^2}$, which, since

$dy = \dfrac{dy}{dx}\,dx = y'(x)\,dx$,

we can rewrite as

$ds = \sqrt{dx^2 + dy^2} = \sqrt{1 + y'(x)^2}\,dx$. (6.1)

Thus the total length of the path between points 1 and 2 is 

$L = \int_1^2 ds = \int_{x_1}^{x_2}\sqrt{1 + y'(x)^2}\,dx$. (6.2)

This equation puts our problem in mathematical form: The unknown is the function 
y = y(x) that defines the path between points 1 and 2. The problem is to find the 
function y (x) for which the integral (6.2) is a minimum. It is interesting to contrast this 
with the standard minimization problem of elementary calculus, where the unknown 



Figure 6.1 A path joining the two points 1 and 2. The length of
the short segment is $ds = \sqrt{dx^2 + dy^2}$, and the total length of
the path is $L = \int_1^2 ds$.

is the value of a variable x at which a known function f(x) is a minimum. Obviously
our new problem is one stage more complicated than this old one. 

Before we set up the machinery to solve this new problem, let’s consider another 
example. 


Fermat’s Principle 


A similar problem is to find the path that light will follow between two points. If the 
refractive index of the medium is constant, then the path is, of course, a straight line, 
but if the refractive index varies, or if we interpose a mirror or lens, the path is not so 
obvious. The French mathematician Fermat (1601-1665) discovered that the required 
path is the path for which the time of travel of the light is minimum. We can illustrate 
Fermat’s principle using Figure 6.1. The time for light to travel a short distance ds 
is ds/v where v denotes the speed of light in the medium, v = c/n where n is the 
refractive index. Thus Fermat’s principle says that the correct path between points 1 
and 2 is the path for which the time 


(time of travel) $= \int_1^2 \dfrac{ds}{v} = \dfrac{1}{c}\int_1^2 n\,ds$


is a minimum. If n is constant, then it can be taken outside the integral and the problem 
reduces to finding the shortest path between points 1 and 2 (and the answer is, of 
course, a straight line). In general, the refractive index can vary, $n = n(x, y)$, and our
problem is to find the path y(x) for which the integral

$\int_{x_1}^{x_2} n(x, y)\sqrt{1 + y'(x)^2}\,dx$ (6.3)

is minimum. [In writing the last expression, I substituted (6.1) for ds.] 
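
A familiar special case makes Fermat’s principle concrete. The Python sketch below assumes two uniform media separated by the line y = 0, with light going from a point at height a above the interface to a point at depth b below it. Since n is constant in each medium, I take the path to be two straight segments meeting at (x, 0) and minimize the optical path $\int n\,ds$ over the single unknown x. The geometry, the numerical values, and the helper `minimize_1d` (a plain ternary search) are all illustrative assumptions. At the minimum, the angles of incidence and refraction should satisfy Snell’s law, $n_1\sin\theta_1 = n_2\sin\theta_2$.

```python
import math

def optical_path(x, n1, n2, a, b, d):
    """c * (time of travel) for a ray from (0, a) to (d, -b) crossing y = 0 at (x, 0)."""
    return n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)

def minimize_1d(func, lo, hi, tol=1e-12):
    """Ternary search for the minimum of a function with a single minimum on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

n1, n2 = 1.0, 1.5      # refractive indices above and below the interface
a, b, d = 1.0, 1.0, 2.0

x_star = minimize_1d(lambda x: optical_path(x, n1, n2, a, b, d), 0.0, d)

# Sines of the angles measured from the normal to the interface.
sin1 = x_star / math.hypot(x_star, a)
sin2 = (d - x_star) / math.hypot(d - x_star, b)
print("n1 sin(theta1) =", n1 * sin1, " n2 sin(theta2) =", n2 * sin2)
```

The two printed quantities should agree to within the accuracy of the search, which is just the statement that the derivative of the travel time with respect to x vanishes at the true path.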

The integral that has to be minimized in connection with Fermat’s principle is 
very similar to the integral (6.2) giving the length of a path; it is just a little more 
complicated, since the factor n(x, y) introduces an extra dependence on x and y.
Similar integrals arise in many other problems. Sometimes we want the path for 
which an integral is a maximum, and sometimes we are interested in both maxima 
and minima. To get some idea of the possibilities, it is helpful to think again about the 
problem of finding maxima and minima of functions in elementary calculus. There we 
know that the necessary condition for a maximum or minimum of a function f(x) is
that its derivative vanish, df/dx = 0. Unfortunately, this condition is not quite enough 
to guarantee a maximum or minimum. As you certainly recall from introductory 
calculus, there are essentially three possibilities, as illustrated in Figure 6.2. A point 
$x_0$ where df/dx is zero may be a maximum or a minimum or, if $d^2f/dx^2$ is also zero,
it may be neither, as indicated in Figure 6.2(c). When df/dx = 0 at a point $x_0$, but
we don’t know which of the three possibilities obtains, we say that $x_0$ is a stationary
point of the function f(x), since an infinitesimal displacement of x from $x_0$ leaves
f(x) unchanged (because the slope is zero).

The situation for the problems of this chapter is very similar. The method I shall 
describe in the next section actually finds the path that makes an integral like (6.2) or 

Figure 6.2 If df/dx = 0 at $x_0$, there are three possibilities: (a) If the second
derivative is positive, then f(x) has a minimum at $x_0$. (b) If the second
derivative is negative, then f(x) has a maximum. (c) If the second derivative
is zero, then there may be a minimum, a maximum, or neither (as shown).


(6.3) stationary, in the sense that an infinitesimal variation of the path from its correct 
course doesn’t change the value of the integral concerned. If you need to know that the 
integral is definitely minimum (or definitely maximum, or perhaps neither), you have 
to check this separately. Incidentally, we are now ready to explain the name of this 
chapter: Since our concern is how infinitesimal variations of a path change an integral, 
the subject is called the calculus of variations. For the same reason, the methods we 
shall develop are called variational methods, and a principle like Fermat’s principle 
is a variational principle. 


6.2 The Euler-Lagrange Equation


The two examples of the last section illustrate the general form of the so-called 
variational problem. We have an integral of the form 


$S = \int_{x_1}^{x_2} f[y(x), y'(x), x]\,dx$ (6.4)


where y(x) is an as-yet unknown curve joining two points $(x_1, y_1)$ and $(x_2, y_2)$ as in
Figure 6.1; that is,

$y(x_1) = y_1$ and $y(x_2) = y_2$. (6.5)

Among all the possible curves satisfying (6.5) (that is, joining the points 1 and 2), 
we have to find the one that makes the integral S a minimum (or maximum or at least 
stationary). To be definite, I shall suppose that we wish to find a minimum. Notice that 
the function $f$ in (6.4) is a function of three variables, $f = f(y, y', x)$, but because
the integral follows the path $y = y(x)$, the integrand $f[y(x), y'(x), x]$ is actually a
function of just the one variable x.

Figure 6.3 The path $y = y(x)$ between points 1 and 2 is the “right”
path, the one for which the integral S of (6.4) is a minimum. Any other
path $Y(x)$ is “wrong,” in that it gives a larger value for S.


Let us denote the correct solution to our problem by $y = y(x)$. Then the integral
S in (6.4) evaluated for $y = y(x)$ is less than for any neighboring curve $y = Y(x)$, as
sketched in Figure 6.3. It is convenient to write the “wrong” curve $Y(x)$ as

$$Y(x) = y(x) + \eta(x) \tag{6.6}$$

where $\eta(x)$ (Greek “eta”) is just the difference between the wrong $Y(x)$ and the right
$y(x)$. Since $Y(x)$ must pass through the endpoints 1 and 2, $\eta(x)$ must satisfy

$$\eta(x_1) = \eta(x_2) = 0. \tag{6.7}$$

There are infinitely many choices for the difference $\eta(x)$; for example, we could
choose $\eta(x) = (x - x_1)(x_2 - x)$ or $\eta(x) = \sin[\pi(x - x_1)/(x_2 - x_1)]$.

The integral S taken along the wrong curve $Y(x)$ must be larger than that along
the right curve $y(x)$, no matter how close the former is to the latter. To express this
requirement, I shall introduce a parameter $\alpha$ and redefine $Y(x)$ to be

$$Y(x) = y(x) + \alpha\eta(x). \tag{6.8}$$

The integral S taken along the curve $Y(x)$ now depends on the parameter $\alpha$, so I shall
call it $S(\alpha)$. The right curve $y(x)$ is obtained from (6.8) by setting $\alpha = 0$. Thus the
requirement that S is minimum for the right curve $y(x)$ implies that $S(\alpha)$ is a minimum
at $\alpha = 0$. With this result, we have converted our problem to the traditional problem
from elementary calculus of making sure that an ordinary function [namely $S(\alpha)$] has
a minimum at a specified point ($\alpha = 0$). For this to be so, the derivative $dS/d\alpha$ must
be zero when $\alpha = 0$.
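A small numerical sketch may make the role of $\alpha$ concrete. It is only an illustration under assumptions of my own choosing: the integral is the path length of (6.2), the endpoints are (0, 0) and (1, 1), the "right" curve is the straight line y = x, and the variation is $\eta(x) = x(1 - x)$. The function names `path_length` and `dS_dalpha` are hypothetical helpers, not anything defined in the text.

```python
import math

def path_length(alpha, n=2000):
    """Length S(alpha) of Y(x) = y(x) + alpha*eta(x) from (0, 0) to (1, 1).

    Assumed for illustration: y(x) = x and eta(x) = x*(1 - x), which
    vanishes at both endpoints as condition (6.7) requires.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h                       # midpoint of the i-th strip
        slope = 1.0 + alpha * (1.0 - 2.0 * x)   # Y'(x) = y'(x) + alpha*eta'(x)
        total += math.sqrt(1.0 + slope * slope) * h
    return total

def dS_dalpha(alpha, step=1e-4):
    """Central-difference estimate of dS/dalpha."""
    return (path_length(alpha + step) - path_length(alpha - step)) / (2 * step)

# Tabulate S(alpha) near alpha = 0 and estimate its slope there.
for alpha in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(f"alpha = {alpha:+.1f}   S = {path_length(alpha):.6f}")
print("dS/dalpha at alpha = 0:", dS_dalpha(0.0))
```

With these choices the tabulated values should be smallest at $\alpha = 0$ and the estimated slope there should be zero to within rounding, which is the condition the derivation goes on to exploit.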

If we write out the integral $S(\alpha)$ in detail, it looks like this:

$$S(\alpha) = \int_{x_1}^{x_2} f(Y, Y', x)\,dx = \int_{x_1}^{x_2} f(y + \alpha\eta,\; y' + \alpha\eta',\; x)\,dx. \tag{6.9}$$

To differentiate (6.9) with respect to $\alpha$, we note that $\alpha$ appears in the integrand $f$, so
we need to evaluate $\partial f/\partial\alpha$. Since $\alpha$ appears in two of the arguments of $f$, this gives
two terms, namely (using the chain rule)

$$\frac{\partial f(y + \alpha\eta,\, y' + \alpha\eta',\, x)}{\partial\alpha} = \eta\frac{\partial f}{\partial y} + \eta'\frac{\partial f}{\partial y'},$$

and for $dS/d\alpha$ (which has to be zero)

$$\frac{dS}{d\alpha} = \int_{x_1}^{x_2}\frac{\partial f}{\partial\alpha}\,dx = \int_{x_1}^{x_2}\left(\eta\frac{\partial f}{\partial y} + \eta'\frac{\partial f}{\partial y'}\right)dx = 0. \tag{6.10}$$

This condition must be true for any $\eta(x)$ satisfying (6.7); that is, for any choice of the
“wrong” path $Y(x) = y(x) + \alpha\eta(x)$.

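To take advantage of the condition (6.10), we need to rewrite the second term on
the right using integration by parts¹ (remember that $\eta'$ means $d\eta/dx$):

$$\int_{x_1}^{x_2}\eta'(x)\frac{\partial f}{\partial y'}\,dx = \left[\eta(x)\frac{\partial f}{\partial y'}\right]_{x_1}^{x_2} - \int_{x_1}^{x_2}\eta(x)\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)dx.$$

Because of the condition (6.7), the first term on the right (the “endpoint term”) is zero.
Thus²

$$\int_{x_1}^{x_2}\eta'(x)\frac{\partial f}{\partial y'}\,dx = -\int_{x_1}^{x_2}\eta(x)\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)dx. \tag{6.11}$$

Substituting this identity into (6.10), we find that

$$\int_{x_1}^{x_2}\eta(x)\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)dx = 0. \tag{6.12}$$

This condition must be satisfied for any choice of the function $\eta(x)$. Therefore, as I
shall argue in a moment, the factor in large parentheses must be zero:

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0 \qquad\text{(Euler-Lagrange Equation)} \tag{6.13}$$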


for all x (in the relevant interval $x_1 < x < x_2$). This is the so-called Euler-Lagrange
equation (named for the Swiss mathematician Leonhard Euler, 1707-1783, and the
Italian-French physicist and mathematician Joseph Lagrange, 1736-1813), which lets
us find the path for which the integral S is stationary. Before I illustrate its use, I need
to discuss the step from (6.12) to (6.13), which is by no means obvious.

¹ If you are used to thinking of integration by parts in the form $\int v\,du = [uv] - \int u\,dv$, then you
will find it helpful to recognize that another way to say the same thing is: $\int u'v\,dx = [uv] - \int uv'\,dx$.
In words: In the integral $\int u'v\,dx$, you can move the prime from the u to the v if you change the
sign and add the endpoint contribution $[uv]$.

² This is the simple form in which integration by parts often appears in physics: Provided the
endpoint term $[uv]$ is zero (as often happens), integration by parts lets you move the differentiation
from the u to the v as long as you change the sign; that is, $\int u'v\,dx = -\int uv'\,dx$.

Equation (6.12) has the form $\int\eta(x)g(x)\,dx = 0$. I would certainly not claim that
this condition alone implies that $g(x) = 0$ for all x. However, (6.12) holds for any
choice of the function $\eta(x)$, and if $\int\eta(x)g(x)\,dx = 0$ for any $\eta(x)$, then we can
conclude that $g(x) = 0$ for all x. To prove this, we must assume that all functions
concerned are continuous, but, as physicists, we would take for granted that this is
the case.³ Now, to prove the assertion, let us assume the contrary, that $g(x)$ is nonzero
in some interval between $x_1$ and $x_2$. Then choose a function $\eta(x)$ that has the same
sign as $g(x)$ (that is, $\eta$ is positive where g is positive and $\eta$ is negative where g is
negative). Then the integrand is continuous, satisfies $\eta(x)g(x) \ge 0$, and is nonzero
at least in some interval. Under these conditions $\int\eta(x)g(x)\,dx$ cannot be zero. This
contradiction implies that $g(x)$ is zero for all x.

This completes the proof of the Euler-Lagrange equation. The procedure for using 
it is this: (1) Set up the problem so that the quantity whose stationary path you seek 
is expressed as an integral in the standard form 

$$S = \int_{x_1}^{x_2} f[y(x), y'(x), x]\,dx, \tag{6.14}$$

where $f[y(x), y'(x), x]$ is the function appropriate to your problem. (2) Write down
the Euler-Lagrange equation (6.13) in terms of the function $f[y(x), y'(x), x]$. (3)
Finally, solve (if possible) the differential equation (6.13) for the function y(x) that 
defines the required stationary path. I shall illustrate this procedure with a couple of 
examples in the next section. 
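Before turning to the worked examples, here is a rough numerical way to test whether a candidate path satisfies (6.13). It is a sketch, not a solution method: the helper `euler_lagrange_residual` and the step size `h` are my own hypothetical choices, and all derivatives are estimated by central differences, so the result is only approximate.

```python
import math

def euler_lagrange_residual(f, y, x, h=1e-3):
    """Estimate df/dy - d/dx(df/dy') at x for the path y(x).

    f(y, yp, x) is the integrand and y is a function of x. Every derivative
    is a central difference with step h, so the value is approximate.
    """
    def dy(t):                      # y'(t)
        return (y(t + h) - y(t - h)) / (2 * h)

    def df_dy(t):                   # partial derivative of f with respect to y
        return (f(y(t) + h, dy(t), t) - f(y(t) - h, dy(t), t)) / (2 * h)

    def df_dyp(t):                  # partial derivative of f with respect to y'
        return (f(y(t), dy(t) + h, t) - f(y(t), dy(t) - h, t)) / (2 * h)

    return df_dy(x) - (df_dyp(x + h) - df_dyp(x - h)) / (2 * h)

def arc_length_integrand(y, yp, x):
    """Integrand of the path-length integral (6.2)."""
    return math.sqrt(1.0 + yp * yp)

# A straight line should give a residual near zero; a parabola should not.
print(euler_lagrange_residual(arc_length_integrand, lambda x: 2.0 * x + 1.0, 0.5))
print(euler_lagrange_residual(arc_length_integrand, lambda x: x * x, 0.5))
```

A residual that stays close to zero along the whole interval is consistent with the path being stationary; a clearly nonzero residual rules it out.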


6.3 Applications of the Euler-Lagrange Equation 


Let us start with the problem that began this chapter, finding the shortest path between 
two points in a plane. 


Example 6.1 Shortest Path between Two Points

We saw that the length of a path between points 1 and 2 is given by the integral
(6.2) as

$$L = \int_1^2 ds = \int_{x_1}^{x_2}\sqrt{1 + y'^2}\,dx.$$

This has the standard form (6.14), with the function $f$ given by

$$f(y, y', x) = (1 + y'^2)^{1/2}. \tag{6.15}$$


³ The claimed result is clearly false if discontinuous functions are admitted. For instance, if we
made $g(x)$ nonzero at just one point, then $\int\eta(x)g(x)\,dx$ would still be zero.

To use the Euler-Lagrange equation (6.13), we must evaluate the two partial
derivatives concerned:

$$\frac{\partial f}{\partial y} = 0 \quad\text{and}\quad \frac{\partial f}{\partial y'} = \frac{y'}{(1 + y'^2)^{1/2}}. \tag{6.16}$$

Since $\partial f/\partial y = 0$, (6.13) implies simply that

$$\frac{d}{dx}\frac{\partial f}{\partial y'} = 0.$$

In other words, $\partial f/\partial y'$ is a constant, C. According to (6.16), this implies that

$$y'^2 = C^2(1 + y'^2),$$

or, with a little rearrangement, $y'^2 = \text{constant}$. This implies that $y'(x)$ is a
constant, which we could call m. Integrating the equation $y'(x) = m$, we find
that $y(x) = mx + b$, and we have proved that the shortest path between two
points is a straight line!
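The same conclusion can be checked by brute force, without the Euler-Lagrange equation at all. The sketch below is an assumption-laden illustration of my own: the path is replaced by a polyline with equally spaced x values, and the interior points are adjusted one at a time (coordinate descent). With its two neighbours held fixed, a point makes the two adjacent segments shortest when it sits midway between them, so each update is just an average. The names `relax_to_shortest_path` and `polyline_length` are hypothetical.

```python
import math

def relax_to_shortest_path(y_start, y_end, n_interior=9, sweeps=500):
    """Minimize the length of a polyline with equally spaced x values.

    Each sweep replaces every interior y by the average of its neighbours,
    which is the exact minimizer for that point with the others held fixed.
    """
    n = n_interior + 1                      # number of segments
    # Deliberately poor starting guess: a sine bump added to the chord.
    ys = [y_start + (y_end - y_start) * i / n + math.sin(math.pi * i / n)
          for i in range(n + 1)]
    ys[0], ys[-1] = y_start, y_end          # pin the endpoints exactly
    for _ in range(sweeps):
        for i in range(1, n):
            ys[i] = 0.5 * (ys[i - 1] + ys[i + 1])
    return ys

def polyline_length(ys, x_span=1.0):
    """Total length of the polyline through the points (i*h, ys[i])."""
    h = x_span / (len(ys) - 1)
    return sum(math.hypot(h, ys[i + 1] - ys[i]) for i in range(len(ys) - 1))

ys = relax_to_shortest_path(0.0, 1.0)
print([round(v, 6) for v in ys])
print("length:", polyline_length(ys))
```

Starting from a visibly curved guess, the points should settle onto the chord joining the endpoints, in agreement with y = mx + b.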


A Note on Variables 

So far we have considered problems with two variables, which we have called x and 
y. Of these, x has been the independent variable, and y the dependent, through the 
relation y = y(x). Unfortunately, we are frequently forced, by convenience or
tradition, to name the variables differently. For example, in a simple one-dimensional
mechanics problem, the independent variable is the time t and the dependent variable
is the position x = x(t). This means you will have to get used to seeing the
Euler-Lagrange equation with the variables x and y replaced by an assortment of other
variables, such as t and x. In the next example, the two variables are x and y, but the 
independent variable is y, and the roles of x and y in (6.13) and (6.14) will be exactly 
reversed. 


Example 6.2 The Brachistochrone

A famous problem in the calculus of variations is this: Given two points 1 and
2, with 1 higher above the ground, in what shape should we build a frictionless
roller coaster track so that a car released from point 1 will reach point 2 in
the shortest possible time? This problem is called the brachistochrone problem,
from the Greek words brachistos meaning “shortest” and chronos meaning
“time.” The geometry of the problem is sketched in Figure 6.4, where I have
taken point 1 as the origin and I have chosen to measure y vertically down.

The time to travel from 1 to 2 is

$$\text{time}(1 \to 2) = \int_1^2\frac{ds}{v} \tag{6.17}$$

where the speed at any height y is determined by conservation of energy to be
$v = \sqrt{2gy}$. (Problem 6.8.)

Figure 6.4 The brachistochrone problem is to find the shape of
track on which a roller coaster released from point 1 will reach
point 2 in the minimum possible time.

Because this gives v as a function of y, it is convenient
to take y as our independent variable. That is, we shall write the unknown path
as x = x(y). This means that the distance ds between neighboring points on the
path has to be written as

$$ds = \sqrt{dx^2 + dy^2} = \sqrt{x'(y)^2 + 1}\,dy \tag{6.18}$$

where a prime now denotes differentiation with respect to y; that is, $x'(y) = dx/dy$.
Thus according to (6.17) the time of interest is

$$\text{time}(1 \to 2) = \frac{1}{\sqrt{2g}}\int_0^{y_2}\frac{\sqrt{x'(y)^2 + 1}}{\sqrt{y}}\,dy. \tag{6.19}$$


Equation (6.19) gives the integral whose minimum we have to find. It is of 
the standard form (6.14), except that the roles of x and y have been interchanged, 
with the integrand 


$$f(x, x', y) = \frac{\sqrt{x'^2 + 1}}{\sqrt{y}}. \tag{6.20}$$

To find the path that makes the time as small as possible, we have only to apply
the Euler-Lagrange equation (again with x and y interchanged) to this function,

$$\frac{\partial f}{\partial x} = \frac{d}{dy}\frac{\partial f}{\partial x'}. \tag{6.21}$$


The function of (6.20) is independent of x, so the derivative df/dx is zero, and 
(6.21) tells us simply that df/dx' is a constant. Evaluating this derivative (and 
squaring it for convenience) we conclude that 


$$\frac{x'^2}{y(1 + x'^2)} = \text{const} = \frac{1}{2a} \tag{6.22}$$

where I have named the constant 1/2a for future convenience. This equation is
easily solved for $x'$ to give

$$x' = \sqrt{\frac{y}{2a - y}},$$

whence

$$x = \int\sqrt{\frac{y}{2a - y}}\,dy. \tag{6.23}$$

This integral can be evaluated by the unlikely looking substitution

$$y = a(1 - \cos\theta) \tag{6.24}$$

which gives (as you should check)

$$x = a\int(1 - \cos\theta)\,d\theta = a(\theta - \sin\theta) + \text{const}. \tag{6.25}$$
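For readers who want the intermediate steps behind (6.25), here is one way the check can go. It is a sketch that uses only (6.23) and (6.24), and it assumes $0 \le \theta \le \pi$, so that $\sin\theta \ge 0$ (the descending part of the curve):

```latex
\begin{align*}
dy &= a\sin\theta\,d\theta, \\
\frac{y}{2a - y} &= \frac{1 - \cos\theta}{1 + \cos\theta}
  = \frac{(1 - \cos\theta)^2}{1 - \cos^2\theta}
  = \frac{(1 - \cos\theta)^2}{\sin^2\theta}, \\
x &= \int\sqrt{\frac{y}{2a - y}}\,dy
  = \int\frac{1 - \cos\theta}{\sin\theta}\,a\sin\theta\,d\theta
  = a\int(1 - \cos\theta)\,d\theta .
\end{align*}
```

The factors of $\sin\theta$ cancel, which is what makes the substitution work so neatly.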

The two equations (6.25) and (6.24) are parametric equations for the required
path, giving x and y as functions of the parameter $\theta$. We have chosen the initial
point 1 to have x = y = 0, so we see from (6.24) that the initial value of $\theta$ is
zero. This in turn implies that the constant of integration in (6.25) is zero. Thus
the final parametric equation for the path is

$$x = a(\theta - \sin\theta) \quad\text{and}\quad y = a(1 - \cos\theta) \tag{6.26}$$

with the constant a chosen so the curve passes through the given point $(x_2, y_2)$.
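Choosing a so that the curve passes through $(x_2, y_2)$ generally means solving a transcendental equation, and a few lines of code can do it. The sketch below is my own illustration, not part of the example: it assumes $x_2 > 0$ and $y_2 > 0$, searches the first arch $0 < \theta_2 < 2\pi$ by bisection, and takes g = 9.8 m/s². The travel time uses the fact that on the cycloid $ds = a\sqrt{2(1 - \cos\theta)}\,d\theta$ and $v = \sqrt{2gy}$, so $dt = \sqrt{a/g}\,d\theta$. The function names are hypothetical.

```python
import math

def fit_cycloid(x2, y2, tol=1e-12):
    """Return (a, theta2) so that the cycloid (6.26) passes through (x2, y2).

    Assumes x2 > 0 and y2 > 0, with theta2 on the first arch (0, 2*pi).
    """
    def mismatch(theta):
        # Zero when (theta - sin theta)/(1 - cos theta) equals x2/y2.
        return (theta - math.sin(theta)) * y2 - (1.0 - math.cos(theta)) * x2

    lo, hi = 1e-3, 2.0 * math.pi    # mismatch is negative at lo, positive at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    theta2 = 0.5 * (lo + hi)
    return y2 / (1.0 - math.cos(theta2)), theta2

def cycloid_time(a, theta2, g=9.8):
    """Travel time from theta = 0 to theta2 along the cycloid, starting at rest."""
    return theta2 * math.sqrt(a / g)

def straight_line_time(x2, y2, g=9.8):
    """Time to slide from rest down a straight frictionless ramp to (x2, y2)."""
    length = math.hypot(x2, y2)
    return length * math.sqrt(2.0 / (g * y2))

a, theta2 = fit_cycloid(math.pi, 2.0)
print("a =", a, " theta2 =", theta2)
print("cycloid time:", cycloid_time(a, theta2))
print("straight-line time:", straight_line_time(math.pi, 2.0))
```

For an endpoint such as $(\pi, 2)$ the cycloid time should come out shorter than the straight-ramp time, as the variational argument leads one to expect.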

The curve (6.26) is plotted in Figure 6.5. In that figure I have continued 
the curve (with dashes) beyond the point 2 to show that the curve that solves 
the brachistochrone problem happens to be a cycloid — the curve traced out by 
a point on the rim of a wheel of radius a, rolling along the underside of the 
x axis (Problem 6.14). Another remarkable feature of this curve is this: If we 
release the cart from rest at point 2 and let it roll to the bottom of the curve 
(point 3 in the figure), the time to roll from 2 to 3 is the same whatever the 
position of 2, anywhere between 1 and 3. This means that the oscillations of a 
cart rolling back and forth on a cycloid-shaped track are exactly isochronous 
(period perfectly independent of amplitude), in contrast with the oscillations of 
a simple pendulum, which are only approximately isochronous, to the extent
that the amplitude is small. (See Problem 6.25.) The isochronous property of
the cycloid was actually used in the design of some clocks, one of which can be
seen in the Victoria and Albert Museum in London.

Figure 6.5 The path for a roller coaster that gives the shortest
time between the given points 1 and 2 is part of the cycloid with a
vertex at 1 and passing through 2. The cycloid is the curve traced
by a point on the rim of a wheel of radius a that rolls along the
underside of the x axis. Point 3 is the lowest point on the curve.
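The isochronous property can be checked numerically. The sketch below rests on assumptions of my own: the cart is released from rest at parameter $\theta_0$ (with $0 < \theta_0 < \pi$), energy conservation gives $v = \sqrt{2ga(\cos\theta_0 - \cos\theta)}$, and the time to reach the bottom at $\theta = \pi$ is $\sqrt{a/g}\int_{\theta_0}^{\pi}\sqrt{(1 - \cos\theta)/(\cos\theta_0 - \cos\theta)}\,d\theta$. The substitution $\theta = \theta_0 + (\pi - \theta_0)s^2$ is a numerical convenience that tames the integrable singularity at the release point; `time_to_bottom` is a hypothetical name.

```python
import math

def time_to_bottom(theta0, a=1.0, g=9.8, n=4000):
    """Time for a cart released from rest at theta0 to reach theta = pi.

    Midpoint rule in the variable s, where theta = theta0 + (pi - theta0)*s**2.
    """
    span = math.pi - theta0
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        theta = theta0 + span * s * s
        dtheta_ds = 2.0 * span * s
        integrand = math.sqrt((1.0 - math.cos(theta)) /
                              (math.cos(theta0) - math.cos(theta)))
        total += integrand * dtheta_ds * h
    return math.sqrt(a / g) * total

# Release points high, midway, and low on the track.
for theta0 in (0.5, 1.5, 2.5):
    print(f"theta0 = {theta0}:  time = {time_to_bottom(theta0):.6f}")
print("pi*sqrt(a/g) =", math.pi * math.sqrt(1.0 / 9.8))
```

The three times should agree with one another, and with $\pi\sqrt{a/g}$, to within the accuracy of the numerical integration.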


Maximum and Minimum vs. Stationary 

You have probably noticed that in neither example of this section did I check that the 
curves that we found actually gave a minimum value to the integral of interest — that 
the straight line between two points actually makes the path length minimum, not a 
maximum or just stationary. The Euler-Lagrange equation guarantees only to give a 
path for which the original integral is stationary. The problem of deciding whether we 
have a minimum or maximum (or a stationary curve that is neither) is generally very 
difficult. In a few cases, it is easy to see which is the case. For instance, it really is 
obvious that a straight line gives the minimum distance between two points in a plane. 
In the case of the brachistochrone, it is not at all obvious that the path we found does 
yield a minimum time, though it is in fact true. 

To illustrate the variety of possibilities, consider the problem of finding the shortest 
path, or geodesic, between two points 1 and 2 on the surface of a globe. As you 
probably know, the answer is the great circle joining the two points.⁴ Using the calculus
of variations you can prove relatively easily that a great circle does indeed make the
distance stationary: Using spherical polar coordinates, every point on the globe can
be identified by the two angles $\theta$ and $\phi$. If you characterize a path as $\phi = \phi(\theta)$ and set
up an integral that gives the distance between 1 and 2 along this path, you can show
that the Euler-Lagrange equation for $\phi(\theta)$ requires that the path follow a great circle.
(See Problem 6.16 for details.) But you have to think a little carefully before deciding 
that this necessarily gives a minimum distance, since there are two different great- 
circle paths connecting any two points 1 and 2 on the globe: For simplicity consider 
two towns on the equator, Quito (near the Pacific coast of Ecuador) and Macapa (at 
the mouth of the Amazon on the Atlantic coast of Brazil). The “right” shortest path 
between these two is, of course, the great-circle path following the equator for about 
2000 miles across South America. But a second possibility, which satisfies the Euler- 
Lagrange equation just as well, is to head west around the equator from Quito, across 
the Pacific, the African continent, and the Atlantic, arriving in Macapa some 22,000 
miles later. You might guess that this path would be a maximum, but it is in fact neither 
maximum nor minimum: It is easy to construct nearby paths that are shorter, but it is 
also easy to find others that are longer. In other words, this second great-circle path 
gives neither a maximum nor a minimum. This second path is, of course, analogous 
to the horizontal point of inflection in elementary calculus. In this problem, luckily, 
it is obvious that the first path gives the true minimum. However, it should be clear 
that, in general, deciding what sort of stationary path the Euler-Lagrange equation 
has given us can be tricky. 
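The claim that the long way round the equator is neither a maximum nor a minimum can be illustrated numerically. This is a sketch under my own assumptions: the globe is a unit sphere, the long arc spans 22/24 of a full circle (a round-number stand-in for the mileages quoted above), and the path is written as $\theta(\phi)$ rather than $\phi(\theta)$ because $\theta$ is constant on the equator. The trial paths $\theta(\phi) = \pi/2 + \epsilon\sin(k\pi\phi/\Delta\phi)$ share the endpoints of the equatorial arc; `path_length_on_unit_sphere` is a hypothetical name.

```python
import math

def path_length_on_unit_sphere(k, eps, dphi, n=20000):
    """Length of theta(phi) = pi/2 + eps*sin(k*pi*phi/dphi) for 0 <= phi <= dphi.

    Uses dl = sqrt(theta'(phi)**2 + sin(theta)**2) dphi on a sphere of radius 1.
    """
    h = dphi / n
    w = k * math.pi / dphi
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h
        theta = math.pi / 2 + eps * math.sin(w * phi)
        dtheta = eps * w * math.cos(w * phi)
        total += math.sqrt(dtheta * dtheta + math.sin(theta) ** 2) * h
    return total

dphi = 2.0 * math.pi * 22 / 24          # assumed angular span of the long route
equator = path_length_on_unit_sphere(1, 0.0, dphi)
one_bump = path_length_on_unit_sphere(1, 0.05, dphi)
two_bumps = path_length_on_unit_sphere(2, 0.05, dphi)
print("equator:", equator)
print("k = 1 variation:", one_bump)
print("k = 2 variation:", two_bumps)
```

With these choices the single-bump path should come out slightly shorter than the equatorial arc and the double-bump path slightly longer, so the arc is stationary without being an extremum.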


⁴ A great circle is the circle in which the globe intersects a plane through the globe’s center.

Fortunately for us, these questions are irrelevant for our purposes. We shall find that
for the applications in mechanics all that matters is that we have a path which makes 
a certain integral stationary. It simply doesn’t matter whether it gives a maximum, 
minimum, or neither. 


6.4 More than Two Variables 


So far we have considered only problems with just two variables, the independent 
variable (usually x) and the dependent (usually y). For most applications in mechanics,
we shall find that there are several dependent variables, though fortunately still only 
one independent variable, which is usually the time t. For a simple example where 
there are two dependent variables, we can go back to the problem of the shortest path 
between two points. When we found the shortest path between two points 1 and 2, we 
assumed that the required path could be written in the form y = y(x). Reasonable as 
this seems, it is easy to think of paths that cannot be written in this way, such as the 
path shown in Figure 6.6. If we want to be perfectly sure we have found the shortest 
path among all possible paths, we must find a method that includes these. The way to 
do this is to write the path in parametric form as 

$$x = x(u) \quad\text{and}\quad y = y(u), \tag{6.27}$$

where u is any convenient variable in terms of which the curve can be parameterized
(for instance, the distance along the path). The parametric form (6.27) includes all
of the curves considered before. [If y = y(x), just use x for the parameter u.] It also
includes curves like that of Figure 6.6 and, in fact, all curves of interest.⁵

The length of a small segment of the path (6.27) is 

$$ds = \sqrt{dx^2 + dy^2} = \sqrt{x'(u)^2 + y'(u)^2}\,du \tag{6.28}$$

where, as usual, a prime denotes differentiation with respect to the function’s
argument; that is, $x'(u) = dx/du$ and $y'(u) = dy/du$. Thus the total path length is

$$L = \int_{u_1}^{u_2}\sqrt{x'(u)^2 + y'(u)^2}\,du, \tag{6.29}$$

and our job is to find the two functions $x(u)$ and $y(u)$ for which this integral is
minimum.
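To see the parametric length (6.29) at work on a curve that cannot be written as y = y(x), one can approximate the integral by a sum of short chords. The sketch below is an illustration of my own; the three-quarter circle and the two parameterizations of the straight line are arbitrary test cases, and `parametric_length` is a hypothetical helper.

```python
import math

def parametric_length(x, y, u1, u2, n=20000):
    """Approximate (6.29) by summing chord lengths between nearby points."""
    total = 0.0
    x_prev, y_prev = x(u1), y(u1)
    for i in range(1, n + 1):
        u = u1 + (u2 - u1) * i / n
        x_next, y_next = x(u), y(u)
        total += math.hypot(x_next - x_prev, y_next - y_prev)
        x_prev, y_prev = x_next, y_next
    return total

# Three quarters of the unit circle, from (1, 0) round to (0, -1).
arc = parametric_length(math.cos, math.sin, 0.0, 1.5 * math.pi)

# The straight line between the same endpoints, parameterized in two ways.
line_a = parametric_length(lambda u: 1.0 - u, lambda u: -u, 0.0, 1.0)
line_b = parametric_length(lambda u: 1.0 - u * u, lambda u: -u * u, 0.0, 1.0)

print("arc:", arc)
print("line, u linear:", line_a)
print("line, u quadratic:", line_b)
```

The two straight-line results should agree, since the length depends on the curve and not on how it is parameterized, and both should be shorter than the arc.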

⁵ In case you are interested in mathematical niceties, I should say that in what follows I shall
assume that all functions concerned are continuous and have continuous second derivatives. This
assumption could be weakened a little, for example, by allowing some discontinuities in the
derivatives.

Figure 6.6 This path between the two points 1 and 2 cannot be
written as y = y(x) nor as x = x(y). It can be written in the
parametric form (6.27).

This problem is more complicated than any we have considered before, because
there are now two unknown functions $x(u)$ and $y(u)$. The general problem of this type
is this: Given an integral of the form

$$S = \int_{u_1}^{u_2} f[x(u), y(u), x'(u), y'(u), u]\,du \tag{6.30}$$

between two fixed points $[x(u_1), y(u_1)]$ and $[x(u_2), y(u_2)]$, find the path $[x(u), y(u)]$
for which the integral S is stationary. The solution to this problem is very similar to 
the one-variable case, and I shall just sketch it, leaving you to fill in the details. The 
upshot is that with two dependent variables, we get two Euler-Lagrange equations. 
To prove this, we proceed very much as before. Let the correct path be given by 

x = x(u) and y = y(u), (6.31) 


and then consider a neighboring “wrong” path of the form 

$$x = x(u) + \alpha\xi(u) \quad\text{and}\quad y = y(u) + \beta\eta(u) \tag{6.32}$$

(where $\xi$ is the Greek letter “xi”). The requirement that the integral S be stationary for
the right path (6.31) is equivalent to the requirement that the integral $S(\alpha, \beta)$, taken
along the wrong path (6.32), satisfy

$$\frac{\partial S}{\partial\alpha} = 0 \quad\text{and}\quad \frac{\partial S}{\partial\beta} = 0 \tag{6.33}$$

when $\alpha = \beta = 0$. These two conditions are the natural generalization of the condition
(6.10) for the one-variable case. By an argument which exactly parallels that leading
from (6.10) to (6.13), you can show that these two conditions are equivalent to the
two Euler-Lagrange equations (see Problem 6.26):


$$\frac{\partial f}{\partial x} = \frac{d}{du}\frac{\partial f}{\partial x'} \quad\text{and}\quad \frac{\partial f}{\partial y} = \frac{d}{du}\frac{\partial f}{\partial y'}. \tag{6.34}$$

These two equations determine a path for which the integral (6.30) is stationary, and,
conversely, if the integral is stationary for some path, that path must satisfy these two
equations.


Example 6.3 The Shortest Path between Two Points Again

We can now solve completely the problem of the shortest path between two
points. (That is, solve it including all possible paths, such as that in Figure 6.6.)
From (6.29), we see that for this problem the integrand $f$ is

$$f(x, x', y, y', u) = \sqrt{x'^2 + y'^2}. \tag{6.35}$$

Since this is independent of x and y, the two derivatives $\partial f/\partial x$ and $\partial f/\partial y$ on
the left sides in (6.34) are zero. Therefore, the two Euler-Lagrange equations
imply simply that the two derivatives $\partial f/\partial x'$ and $\partial f/\partial y'$ are constants,

$$\frac{\partial f}{\partial x'} = \frac{x'}{\sqrt{x'^2 + y'^2}} = C_1 \quad\text{and}\quad \frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{x'^2 + y'^2}} = C_2. \tag{6.36}$$

If we divide the second equation by the first and recognize that $y'/x'$ is just the
derivative $dy/dx$, we conclude that

$$\frac{dy}{dx} = \frac{y'}{x'} = \frac{C_2}{C_1} = m, \tag{6.37}$$

say. It follows that the required path is a straight line, y = mx + b. It is interesting
that this proof using a parametric equation is not only better than our previous
proof (in that the new proof includes all possible paths), it is also marginally
easier.


The generalization of the Euler-Lagrange equation to an arbitrary number of 
dependent variables is straightforward, and doesn’t need to be spelled out in detail. 
Here I would just like to sketch the way the Euler-Lagrange equations will appear in 
the Lagrangian formulation of mechanics. 

The independent variable in Lagrangian mechanics is the time t. The dependent
variables are the coordinates that specify the position, or “configuration,” of a system,
and are usually denoted by $q_1, q_2, \ldots, q_n$. The number n of coordinates depends on the
nature of the system. For a single particle moving unconstrained in three dimensions,
n is 3, and the three coordinates $q_1, q_2, q_3$ could be just the three Cartesian coordinates
x, y, z, or they might be the spherical polar coordinates $r, \theta, \phi$. For N particles moving
freely in three dimensions, n is 3N and the coordinates $q_1, \ldots, q_n$ could be the 3N
Cartesian coordinates $x_1, y_1, z_1, \ldots, x_N, y_N, z_N$. For a double pendulum (two simple
pendulums, with the second suspended from the bob of the first, as in Figure 6.7),
there would be two coordinates $q_1, q_2$, which could be chosen to be the two angles
shown in Figure 6.7. Because the coordinates $q_1, \ldots, q_n$ can take on so many guises,
they are often referred to as generalized coordinates. It is often useful to think of
the n generalized coordinates as defining a point in an n-dimensional configuration
space, each of whose points labels a unique position, or configuration, of the system.

Figure 6.7 A good choice of generalized coordinates to identify
the position of a double pendulum is the pair of angles $\theta_1$ and $\theta_2$
between the pendulums and the vertical.


The ultimate goal in most problems in Lagrangian mechanics is to find how the
coordinates vary with time; that is, to find the n functions $q_1(t), \ldots, q_n(t)$. One can
regard these n functions as defining a path in the n-dimensional configuration space.
This path is, of course, determined by Newton’s second law, but we shall find that it
can, equivalently, be characterized as the path for which a certain integral is stationary.
This means that it must satisfy the corresponding Euler-Lagrange equations (called
just Lagrange equations in this context), and it turns out that these Lagrange equations
are usually much easier to write down and use than Newton’s second law. In particular,
unlike Newton’s second law, Lagrange’s equations take exactly the same simple form
in all coordinate systems.

The integral S whose stationary value determines the evolution of the mechanical
system is called the action integral. Its integrand is called the Lagrangian $\mathcal{L}$ and
depends on the n coordinates $q_1, q_2, \ldots, q_n$, their n time derivatives $\dot q_1, \dot q_2, \ldots, \dot q_n$,
and the time t:

$$\mathcal{L} = \mathcal{L}(q_1, \dot q_1, \ldots, q_n, \dot q_n, t). \tag{6.38}$$

Notice that since the independent variable is t, the derivatives of the coordinates $q_i$
are time derivatives and are denoted, as usual, with dots as $\dot q_i$. The requirement that
the action integral

$$S = \int_{t_1}^{t_2}\mathcal{L}(q_1, \dot q_1, \ldots, q_n, \dot q_n, t)\,dt \tag{6.39}$$

be stationary implies n Euler-Lagrange equations

$$\frac{\partial\mathcal{L}}{\partial q_1} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q_1}, \quad \frac{\partial\mathcal{L}}{\partial q_2} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q_2}, \quad\cdots,\quad\text{and}\quad \frac{\partial\mathcal{L}}{\partial q_n} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q_n}. \tag{6.40}$$

These n equations correspond precisely to the two Euler-Lagrange equations in (6.34)
and are proved in exactly the same way. If these n equations are satisfied, then the
action integral (6.39) is stationary; and if the action integral is stationary, then these n
equations are satisfied. In the next chapter, you will see where these equations come
from and how to use them.


Principal Definitions and Equations of Chapter 6

The Euler-Lagrange Equation 

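An integral of the form

$$S = \int_{x_1}^{x_2} f[y(x), y'(x), x]\,dx \qquad\text{[Eq. (6.4)]}$$

taken along a path y = y(x) is stationary with respect to variations of that path if and
only if y(x) satisfies the Euler-Lagrange equation

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0. \qquad\text{[Eq. (6.13)]}$$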


Several Variables 


If there are n dependent variables in the original integral, there are n Euler-Lagrange
equations. For instance, an integral of the form

$$S = \int_{u_1}^{u_2} f[x(u), y(u), x'(u), y'(u), u]\,du,$$

with two dependent variables [x(u) and y(u)], is stationary with respect to variations
of x(u) and y(u) if and only if these two functions satisfy the two equations

$$\frac{\partial f}{\partial x} = \frac{d}{du}\frac{\partial f}{\partial x'} \quad\text{and}\quad \frac{\partial f}{\partial y} = \frac{d}{du}\frac{\partial f}{\partial y'}. \qquad\text{[Eq. (6.34)]}$$


Problems for Chapter 6

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult (***).

Section 6.1 Two Examples

6.1 * The shortest path between two points on a curved surface, such as the surface of a sphere, is
called a geodesic. To find a geodesic, one has first to set up an integral that gives the length of a path on
the surface in question. This will always be similar to the integral (6.2) but may be more complicated
(depending on the nature of the surface) and may involve different coordinates than x and y. To illustrate
this, use spherical polar coordinates $(r, \theta, \phi)$ to show that the length of a path joining two points on a
sphere of radius R is

$$L = R\int_{\theta_1}^{\theta_2}\sqrt{1 + \sin^2\theta\,\phi'(\theta)^2}\,d\theta \tag{6.41}$$

if $(\theta_1, \phi_1)$ and $(\theta_2, \phi_2)$ specify the two points and we assume that the path is expressed as $\phi = \phi(\theta)$.
(You will find how to minimize this length in Problem 6.16.)

6.2 * Do the same as in Problem 6.1 but find the length L of a path on a cylinder of radius R, using
cylindrical polar coordinates $(\rho, \phi, z)$. Assume that the path is specified in the form $\phi = \phi(z)$.

6.3 ** Consider a ray of light traveling in a vacuum from point $P_1$ to $P_2$ by way of the point Q on a
plane mirror, as in Figure 6.8. Show that Fermat’s principle implies that, on the actual path followed, Q
lies in the same vertical plane as $P_1$ and $P_2$ and obeys the law of reflection, that $\theta_1 = \theta_2$. [Hints: Let the
mirror lie in the xz plane, and let $P_1$ lie on the y axis at $(0, y_1, 0)$ and $P_2$ in the xy plane at $(x_2, y_2, 0)$.
Finally let Q = (x, 0, z). Calculate the time for the light to traverse the path $P_1QP_2$ and show that it is
minimum when Q has z = 0 and satisfies the law of reflection.]

6.4 ** A ray of light travels from point $P_1$ in a medium of refractive index $n_1$ to $P_2$ in a medium of
index $n_2$, by way of the point Q on the plane interface between the two media, as in Figure 6.9. Show
that Fermat’s principle implies that, on the actual path followed, Q lies in the same vertical plane as $P_1$
and $P_2$ and obeys Snell’s law, that $n_1\sin\theta_1 = n_2\sin\theta_2$. [Hints: Let the interface be the xz plane, and
let $P_1$ lie on the y axis at $(0, h_1, 0)$ and $P_2$ in the xy plane at $(x_2, -h_2, 0)$. Finally let Q = (x, 0, z).
Calculate the time for the light to traverse the path $P_1QP_2$ and show that it is minimum when Q has
z = 0 and satisfies Snell’s law.]

6.5 ** Fermat’s principle is often stated as “the travel time of a ray of light, moving from point A
to B, is minimum along the actual path.” Strictly speaking it should say that the time is stationary,
not minimum. In fact one can construct situations for which the time is maximum along the actual
path. Here is one: Consider the concave, hemispherical mirror shown in Figure 6.10, with A and B
at opposite ends of a diameter. Consider a ray of light traveling in a vacuum from A to B with one
reflection at P, in the same vertical plane as A and B. According to the law of reflection, the actual
path goes via point $P_0$ at the bottom of the hemisphere ($\theta = 0$). Find the time of travel along the path
APB as a function of $\theta$ and show that it is maximum at $P = P_0$.

Figure 6.9 Problem 6.4

Figure 6.10 Problem 6.5

6.6 ** In many problems in the calculus of variations, you need to know the length ds of a short segment
of a curve on a surface, as in the expression (6.1). Make a table giving the appropriate expressions for
ds in the following eight situations: (a) A curve given by y = y(x) in a plane, (b) same but x = x(y),
(c) same but $r = r(\phi)$, (d) same but $\phi = \phi(r)$; (e) curve given by $\phi = \phi(z)$ on a cylinder of radius R,
(f) same but $z = z(\phi)$; (g) curve given by $\theta = \theta(\phi)$ on a sphere of radius R, (h) same but $\phi = \phi(\theta)$.

Section 6.3 Applications of the Euler-Lagrange Equation

6.7 * Consider a right circular cylinder of radius R centered on the z axis. Find the equation giving $\phi$
as a function of z for the geodesic (shortest path) on the cylinder between two points with cylindrical
polar coordinates $(R, \phi_1, z_1)$ and $(R, \phi_2, z_2)$. Describe the geodesic. Is it unique? By imagining the
surface of the cylinder unwrapped and laid out flat, explain why the geodesic has the form it does.

6.8 * Verify that the speed of the roller coaster car in Example 6.2 (page 222) is $\sqrt{2gy}$. (Assume the
wheels have negligible mass and neglect friction.)

6.9 * Find the equation of the path joining the origin O to the point P(1, 1) in the xy plane that makes
the integral $\int_O^P (y'^2 + yy' + y^2)\,dx$ stationary.

6.10 * In general the integrand $f(y, y', x)$ whose integral we wish to minimize depends on y, y', and x.
There is a considerable simplification if / happens to be independent of y, that is, / = f(y', x). (This 
happened in both Examples 6.1 and 6.2, though in the latter the roles of x and y were interchanged.) 
Prove that when this happens, the Euler-Lagrange equation (6.13) reduces to the statement that 

df/dy' = const. (6.42) 

Since this is a first-order differential equation for y(x), while the Euler-Lagrange equation is generally 
second order, this is an important simplification and the result (6.42) is sometimes called a first integral 
of the Euler-Lagrange equation. In Lagrangian mechanics we’ll see that this simplification arises when 
a component of momentum is conserved. 

6.11 ** Find and describe the path y = y(x) for which the integral $\int_{x_1}^{x_2}\sqrt{x}\sqrt{1 + y'^2}\,dx$ is stationary.

6.12 [...] $x\sqrt{1 - y'^2}\,dx$ is stationary is an arcsinh function.

6.13 ** In relativity theory, velocities can be represented by points in a certain “rapidity space” in
which the distance between two neighboring points is $ds = [2/(1 - r^2)]\sqrt{dr^2 + r^2\,d\phi^2}$, where r and
$\phi$ are polar coordinates, and we consider just a two-dimensional space. (An expression like this for
the distance in a non-Euclidean space is often called the metric of the space.) Use the Euler-Lagrange
equation to show that the shortest distance from the origin to any other point is a straight line.

6.14 ** (a) Prove that the brachistochrone curve (6.26) is indeed a cycloid, that is, the curve traced
by a point on the circumference of a wheel of radius a rolling along the underside of the x axis.
(b) Although the cycloid repeats itself indefinitely in a succession of loops, only one loop is relevant
to the brachistochrone problem. Sketch a single loop for three different values of a (all with the same
starting point 1) and convince yourself that for any point 2 (with positive coordinates $x_2$, $y_2$) there is
exactly one value of a for which the loop goes through the point 2. (c) To find the value of a for a given
point $x_2$, $y_2$ usually requires solution of a transcendental equation. Here are two cases where you can
do it more simply: For $x_2 = \pi b$, $y_2 = 2b$ and again for $x_2 = 2\pi b$, $y_2 = 0$ find the value of a for which
the cycloid goes through the point 2 and find the corresponding minimum times.

6.15 ** Consider again the brachistochrone problem of Example 6.2 (page 222) but suppose that the 
car is launched from point 1 with a fixed speed $v_0$. Show that the path of minimum time to the fixed
point 2 is still a cycloid, but with its cusp (the top point of the curve) a height $v_0^2/2g$ above point 1.

6.16 ** Use the result (6.41) of Problem 6.1 to prove that the geodesic (shortest path) between two
given points on a sphere is a great circle. [Hint: The integrand $f(\phi, \phi', \theta)$ in (6.41) is independent of $\phi$,
so the Euler-Lagrange equation reduces to $\partial f/\partial\phi' = c$, a constant. This gives you $\phi'$ as a function of $\theta$.
You can avoid doing the final integral by the following trick: There is no loss of generality in choosing 
your z axis to pass through the point 1. Show that with this choice the constant c is necessarily zero, 
and describe the corresponding geodesics.] 

6.17 ** Find the geodesics on the cone whose equation in cylindrical polar coordinates is $z = \lambda\rho$. [Let
the required curve have the form $\phi = \phi(\rho)$.] Check your result for the case that $\lambda \to 0$.

6.18 ** Show that the shortest path between two given points in a plane is a straight line, using plane 
polar coordinates. 

6.19 ** A surface of revolution is generated as follows: Two fixed points $(x_1, y_1)$ and $(x_2, y_2)$ in the
x, y plane are joined by a curve y = y(x). [Actually you’ll make life easier if you start out writing
this as x = x(y).] The whole curve is now rotated about the x axis to generate a surface. Show that
the curve for which the area of the surface is minimum has the form $y = y_0\cosh[(x - x_0)/y_0]$, where
$x_0$ and $y_0$ are constants. (This is often called the soap-bubble problem, since the resulting surface is
usually the shape of a soap bubble held by two coaxial rings of radii $y_1$ and $y_2$.)

6.20 ** If you haven’t done it, take a look at Problem 6.10. Here is a second situation in which you
can find a “first integral” of the Euler-Lagrange equation: Argue that if it happens that the integrand
$f(y, y', x)$ does not depend explicitly on x, that is, $f = f(y, y')$, then

$$\frac{df}{dx} = \frac{\partial f}{\partial y}\,y' + \frac{\partial f}{\partial y'}\,y''.$$

Use the Euler-Lagrange equation to replace $\partial f/\partial y$ on the right, and hence show that

$$\frac{df}{dx} = \frac{d}{dx}\left(y'\,\frac{\partial f}{\partial y'}\right).$$

This gives you the first integral

$$f - y'\,\frac{\partial f}{\partial y'} = \text{const.} \tag{6.43}$$

This can simplify several calculations. (See Problems 6.21 and 6.22 for examples.) In Lagrangian
mechanics, where the independent variable is the time t, the corresponding result is that if the Lagrangian
function is independent of t, then energy is conserved. (See Section 7.8.)

6.21 ** In Example 6.2 (page 222) we found the brachistochrone by exchanging the variables x and 
y. Here is a method that avoids that exchange: Write the time as in Equation (6.19) but using x as the 
variable of integration. Your integrand should have the form $f(y, y', x) = \sqrt{(y'^2 + 1)/y}$. Since this is
independent of x, you can invoke the “first integral” (6.43) of Problem 6.20. Show that this differential 
equation leads you to the same integral for x as in Equation (6.23) and hence to the same curve as 
before. 

6.22 *** You are given a string of fixed length l with one end fastened at the origin O, and you are to
place the string in the xy plane with its other end on the x axis in such a way as to enclose the maximum
area between the string and the x axis. Show that the required shape is a semicircle. The area enclosed
is of course $\int y\,dx$, but show that you can rewrite this in the form $\int_0^l f\,ds$, where s denotes the distance
measured along the string from O, where $f = y\sqrt{1 - y'^2}$, and y' denotes dy/ds. Since f does not
involve the independent variable s explicitly, you can exploit the “first integral” (6.43) of Problem
6.20.

6.23 An aircraft whose airspeed is $v_0$ has to fly from town O (at the origin) to town P, which
is a distance D due east. There is a steady gentle wind shear, such that $\mathbf{v}_{\text{wind}} = Vy\,\hat{\mathbf{x}}$, where x and y
are measured east and north respectively. Find the path, y = y(x), which the plane should follow to
minimize its flight time, as follows: (a) Find the plane’s ground speed in terms of $v_0$, V, $\phi$ (the angle
by which the plane heads to the north of east), and the plane’s position. (b) Write down the time of
flight as an integral of the form $\int_0^D f\,dx$. Show that if we assume that y' and $\phi$ both remain small (as
is certainly reasonable if the wind speed is not too large), then the integrand f takes the approximate
form $f = (1 + \tfrac{1}{2}y'^2)/(1 + ky)$ (times an uninteresting constant) where $k = V/v_0$. (c) Write down the
Euler-Lagrange equation that determines the best path. To solve it, make the intelligent guess that
$y(x) = \lambda x(D - x)$, which clearly passes through the two towns. Show that it satisfies the Euler-Lagrange
equation, provided $\lambda = (\sqrt{4 + 2k^2D^2} - 2)/(kD^2)$. How far north does this path take the
plane, if D = 2000 miles, $v_0$ = 500 mph, and the wind shear is V = 0.5 mph/mi? How much time does
the plane save by following this path? [You’ll probably want to use a computer to do this integral.]
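
For the numerical part of Problem 6.23, the following is a minimal sketch in Python of one way to "use a computer to do this integral". It takes the trial parabola and its coefficient exactly as quoted in part (c) and the approximate integrand quoted in part (b). Two things are assumptions of this sketch rather than statements from the problem: the "uninteresting constant" is taken to be 1/v_0, and the helper `simpson` and all variable names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Data quoted in the problem statement.
D = 2000.0    # distance between the towns, miles
v0 = 500.0    # airspeed, mph
V = 0.5       # wind shear, mph per mile
k = V / v0

# Coefficient of the trial path y = lam * x * (D - x), as given in part (c).
lam = (math.sqrt(4 + 2 * k**2 * D**2) - 2) / (k * D**2)

def y(x):
    return lam * x * (D - x)

def yprime(x):
    return lam * (D - 2 * x)

def integrand(x):
    # Approximate integrand f = (1 + y'^2 / 2) / (1 + k y) from part (b).
    return (1 + 0.5 * yprime(x)**2) / (1 + k * y(x))

max_north = y(D / 2)                          # the parabola peaks at mid-route
t_straight = D / v0                           # y = 0 gives f = 1 everywhere
t_curved = simpson(integrand, 0.0, D) / v0    # assumed constant factor 1/v0
saving_minutes = 60 * (t_straight - t_curved)

print(f"lambda = {lam:.3e} per mile, farthest north = {max_north:.0f} miles")
print(f"time saved = {saving_minutes:.1f} minutes")
```

Simpson's rule is used only because it needs nothing beyond the standard library; any quadrature routine would serve equally well.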

6.24 *** Consider a medium in which the refractive index n is inversely proportional to $r^2$; that is,
$n = a/r^2$, where r is the distance from the origin. Use Fermat’s principle, that the integral (6.3) is
stationary, to find the path of a ray of light travelling in a plane containing the origin. [Hint: Use
two-dimensional polar coordinates and write the path as $\phi = \phi(r)$. The Fermat integral should have the
form $\int f(\phi, \phi', r)\,dr$, where $f(\phi, \phi', r)$ is actually independent of $\phi$. The Euler-Lagrange equation
therefore reduces to $\partial f/\partial\phi' = \text{const}$. You can solve this for $\phi'$ and then integrate to give $\phi$ as a function
of r. Rewrite this to give r as a function of $\phi$ and show that the resulting path is a circle through the
origin. Discuss the progress of the light around the circle.]

6.25 *** Consider a single loop of the cycloid (6.26) with a fixed value of a, as shown in Figure 6.11.
A cart is released from rest at a point $P_0$ anywhere on the track between O and the lowest point P (that
is, $P_0$ has parameter $0 < \theta_0 < \pi$). Show that the time for the cart to roll from $P_0$ to P is given by the
integral

$$\text{time}(P_0 \to P) = \sqrt{\frac{a}{g}} \int_{\theta_0}^{\pi} \sqrt{\frac{1 - \cos\theta}{\cos\theta_0 - \cos\theta}}\;d\theta$$

and prove that this time is equal to $\pi\sqrt{a/g}$. Since this is independent of the position of $P_0$, the cart
takes the same time to roll from $P_0$ to P, whether $P_0$ is at O, or anywhere between O and P, even
infinitesimally close to P. Explain qualitatively how this surprising result can possibly be true. [Hint:
To do the mathematics, you have to make some cunning changes of variables. One route is this: Write
$\theta = \pi - 2\alpha$ and then use the relevant trig identities to replace the cosines of $\theta$ by sines of $\alpha$. Now
substitute $\sin\alpha = u$ and do the remaining integral.]

Figure 6.11 Problem 6.25


section 6.4 More than Two Variables 

6.26 ** Give in detail the argument that leads from the stationary property of the integral (6.30) to the 
two Euler-Lagrange equations (6.34). 

6.27 ** Prove that the shortest path between two points in three dimensions is a straight line. Write
the path in the parametric form 

x=x(u), y = y(u), and z = z(u) 

and then use the three Euler-Lagrange equations corresponding to (6.34). 




CHAPTER 7


Lagrange’s Equations 


The theoretical development of the laws of motion of bodies is a problem of such interest and 
importance that it has engaged the attention of all the most eminent mathematicians since 
the invention of dynamics as a mathematical science by Galileo, and especially since the 
wonderful extension which was given to that science by Newton. Among the successors of 
those illustrious men, Lagrange has perhaps done more than any other analyst to give extent 
and harmony to such deductive researches, by showing that the most varied consequences 
respecting the motions of systems of bodies may be derived from one radical formula; the 
beauty of the methods so suiting the dignity of the results as to make of his great work a 
kind of scientific poem. 

—William Rowan Hamilton, 1834 


Armed with the ideas of the calculus of variations, we are ready to set up the version 
of mechanics published in 1788 by the Italian-French astronomer and mathematician 
Lagrange (1736-1813). The Lagrangian formulation has two important advantages 
over the earlier Newtonian formulation. First, Lagrange’s equations, unlike Newton’s, 
take the same form in any coordinate system. Second, in treating constrained systems, 
such as a bead sliding on a wire, the Lagrangian approach eliminates the forces of 
constraint (such as the normal force of the wire, which constrains the bead to remain 
on the wire). This greatly simplifies most problems, since the constraint forces are 
usually unknown, and this simplification comes at almost no cost, since we usually 
do not want to know these forces anyway. 

In Section 7.1, I prove that Lagrange’s equations are equivalent to Newton’s second
law for a particle moving unconstrained in three dimensions. The extension of this 
result to N unconstrained particles is surprisingly straightforward, and I leave the 
details for you to supply (Problem 7.7). In the next few sections, I take up the harder, 
and more interesting, case of constrained systems. I begin with some simple examples 
and important definitions (such as degrees of freedom). Then, in Section 7.4, I prove
Lagrange’s equations for a particle constrained to move on a curved surface (leaving 
the general case to Problem 7.13). Section 7.5 offers several examples, some of which 
are distinctly easier to set up in the Lagrangian formulation than in the Newtonian. In 
Section 7.6, I introduce the curious terminology of “ignorable coordinates.” Finally,
after some summarizing remarks in Section 7.7, the chapter concludes with three 
sections on topics which, although very important, could be omitted on a first reading. 
In Section 7.8, I discuss how the laws of energy and momentum conservation appear
in Lagrangian mechanics. Section 7.9 describes how Lagrange’s equations can be 
extended to include magnetic forces, and Section 7.10 is an introduction to the idea 
of Lagrange multipliers. 

Throughout this chapter, except in Section 7.9, I treat only the case that all 
nonconstraint forces are conservative or can, at least, be derived from a potential 
energy function. This restriction can be significantly relaxed, but already includes 
most of the applications that you are likely to meet in practice. 


7.1 Lagrange’s Equations for Unconstrained Motion 


Consider a particle moving unconstrained in three dimensions, subject to a conservative
net force F(r). The particle’s kinetic energy is, of course,

$$T = \tfrac{1}{2}mv^2 = \tfrac{1}{2}m\dot{\mathbf{r}}^2 = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2), \tag{7.1}$$

and its potential energy is 


$$U = U(\mathbf{r}) = U(x, y, z). \tag{7.2}$$

The Lagrangian function, or just Lagrangian, is defined as

$$\mathcal{L} = T - U. \tag{7.3}$$


Notice first that the Lagrangian is the KE minus the PE. It is not the same as the 
total energy. You are certainly entitled to ask why the quantity T — U should be of 
any interest. There seems to be no simple answer to this question except that it is, as 
we shall see directly. Notice also that I am using a script $\mathcal{L}$ for the Lagrangian$^1$ (to
distinguish it from the angular momentum L and a length L) and that $\mathcal{L}$ depends on the
particle’s position (x, y, z) and its velocity $(\dot{x}, \dot{y}, \dot{z})$; that is, $\mathcal{L} = \mathcal{L}(x, y, z, \dot{x}, \dot{y}, \dot{z})$.

Let us consider the two derivatives, 


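$$\frac{\partial\mathcal{L}}{\partial x} = -\frac{\partial U}{\partial x} = F_x \tag{7.4}$$

and

$$\frac{\partial\mathcal{L}}{\partial\dot{x}} = \frac{\partial T}{\partial\dot{x}} = m\dot{x} = p_x. \tag{7.5}$$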


1 This notation gets into difficulty in field theories where the Lagrangian is often denoted by L, 
and $\mathcal{L}$ is used for the Lagrangian density, but this won’t be a problem for us.
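Differentiating the second equation with respect to time and remembering Newton’s
second law, $F_x = \dot{p}_x$ (I take for granted that our coordinate frame is inertial), we see
that

$$\frac{\partial\mathcal{L}}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}}. \tag{7.6}$$

In exactly the same way we can prove corresponding equations in y and z. Thus
we have shown that Newton’s second law implies the three Lagrange equations (in
Cartesian coordinates so far):

$$\frac{\partial\mathcal{L}}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}}, \quad \frac{\partial\mathcal{L}}{\partial y} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{y}}, \quad \text{and} \quad \frac{\partial\mathcal{L}}{\partial z} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{z}}. \tag{7.7}$$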

You can easily check that the argument just given works equally well in reverse, so that 
(for a single particle in Cartesian coordinates, at least) Newton’s second law is exactly 
equivalent to the three Lagrange equations (7.7). The particle’s path as determined 
by Newton’s second law is the same as the path determined by the three Lagrange 
equations. 

Our next step is to recognize that the three equations of (7.7) have exactly the 
form of the Euler-Lagrange equations (6.40). Therefore, they imply that the integral 
$S = \int \mathcal{L}\,dt$ is stationary for the path followed by the particle. That this integral, called
the action integral, is stationary for the particle’s path is called Hamilton’s principle 2 
(after its inventor, the Irish mathematician, Hamilton, 1805-1865) and can be restated 
as follows: 


Hamilton’s Principle 

The actual path which a particle follows between two points 1 and 2 in a given
time interval, $t_1$ to $t_2$, is such that the action integral

$$S = \int_{t_1}^{t_2} \mathcal{L}\,dt \tag{7.8}$$

is stationary when taken along the actual path.


Although we have so far proved this principle only for a single particle and in Cartesian 
coordinates, we are going to find that it is valid for a huge class of mechanical systems 
and for almost any choice of coordinates. 
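
Hamilton's principle can be checked numerically in a simple case. The sketch below is an illustration only: it assumes a particle moving vertically in uniform gravity (so the Lagrangian is T - U with U = mgy), with endpoints y = 0 at t = 0 and t = 1, and it perturbs the true path by eps*sin(pi*t), which vanishes at both ends. The values of m and g, the perturbation shape, and the helper names are all choices made for this example.

```python
import math

m, g = 1.0, 9.8        # illustrative values (assumptions)
t1, t2 = 0.0, 1.0      # the path starts and ends at y = 0

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def action(eps):
    """Action along y(t) = y_true(t) + eps * sin(pi t), with y_true = g t (1 - t) / 2."""
    def lagrangian(t):
        y = 0.5 * g * t * (1 - t) + eps * math.sin(math.pi * t)
        ydot = 0.5 * g * (1 - 2 * t) + eps * math.pi * math.cos(math.pi * t)
        return 0.5 * m * ydot**2 - m * g * y      # T - U
    return simpson(lagrangian, t1, t2)

S0 = action(0.0)
for eps in (-0.1, -0.01, 0.01, 0.1):
    print(f"eps = {eps:+.2f}   S - S0 = {action(eps) - S0:.6e}")
```

The change in the action is the same for +eps and -eps and shrinks like eps squared, so there is no first-order change: the action is stationary on the true path. (For this particular Lagrangian it happens to be a minimum, but the principle itself asserts only stationarity.)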

So far we have proved for a single particle that the following three statements are 
exactly equivalent: 


2 Try not to be confused by the unlucky circumstance that Hamilton’s principle is one possible 
statement of the Lagrangian formulation of classical mechanics (as opposed to the Hamiltonian 
formulation). 
1. A particle’s path is determined by Newton’s second law F = ma.

2. The path is determined by the three Lagrange equations (7.7), at least in 
Cartesian coordinates. 

3. The path is determined by Hamilton’s principle. 

Hamilton’s principle has found generalizations in many fields outside classical 
mechanics (field theories, for example) and has given a unity to various diverse areas 
of physics. In the twentieth century it has played an important role in the formulation 
of quantum theories. However, for our present purposes its great importance is that it 
lets us prove that Lagrange’s equations hold in more-or-less any coordinate system: 

Instead of the Cartesian coordinates r = (x, y, z), suppose that we wish to use
some other coordinates. These could be spherical polar coordinates $(r, \theta, \phi)$, or
cylindrical polars $(\rho, \phi, z)$, or any set of “generalized coordinates” $q_1, q_2, q_3$, with the
property that each position r specifies a unique value of $(q_1, q_2, q_3)$ and vice versa;
that is,

$$q_i = q_i(\mathbf{r}) \quad \text{for } i = 1, 2, \text{ and } 3, \tag{7.9}$$

and

$$\mathbf{r} = \mathbf{r}(q_1, q_2, q_3). \tag{7.10}$$

These two equations guarantee that for any value of r = (x, y, z) there is a unique
$(q_1, q_2, q_3)$ and vice versa. Using (7.10) we can rewrite (x, y, z) and $(\dot{x}, \dot{y}, \dot{z})$ in terms
of $(q_1, q_2, q_3)$ and $(\dot{q}_1, \dot{q}_2, \dot{q}_3)$. Next, we can rewrite the Lagrangian $\mathcal{L} = \tfrac{1}{2}m\dot{\mathbf{r}}^2 - U(\mathbf{r})$
in terms of these new variables as

$$\mathcal{L} = \mathcal{L}(q_1, q_2, q_3, \dot{q}_1, \dot{q}_2, \dot{q}_3)$$

and the action integral as

$$S = \int_{t_1}^{t_2} \mathcal{L}(q_1, q_2, q_3, \dot{q}_1, \dot{q}_2, \dot{q}_3)\,dt.$$

Now, the value of the integral S is unaltered by this change of variables. Therefore, 
the statement that S is stationary for variations of the path around the correct path 
must still be true in our new coordinate system, and, by the results of Chapter 6, this 
means that the correct path must satisfy the three Euler-Lagrange equations, 

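$$\frac{\partial\mathcal{L}}{\partial q_1} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_1}, \quad \frac{\partial\mathcal{L}}{\partial q_2} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_2}, \quad \text{and} \quad \frac{\partial\mathcal{L}}{\partial q_3} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_3}, \tag{7.11}$$

with respect to the new coordinates $q_1$, $q_2$, and $q_3$. Since these new coordinates are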
any set of generalized coordinates, the qualification “in Cartesian coordinates” can be 
omitted from the statement (2) above. This result — that Lagrange’s equations have 
the same form for any choice of generalized coordinates — is one of the two main 
reasons that the Lagrangian formalism is so useful. 

There is one point about our derivation of Lagrange’s equations that is worth 
keeping at the back of your mind. A crucial step in our proof was the observation 
that (7.6) was equivalent to Newton’s second law $F_x = \dot{p}_x$, which in turn is true only
if the original frame in which we wrote down $\mathcal{L} = T - U$ is inertial. Thus, although
Lagrange’s equations are true for any choice of generalized coordinates $q_1$, $q_2$, $q_3$
(and these generalized coordinates may in fact be the coordinates of a noninertial
reference frame), we must nevertheless be careful that, when we first write down
the Lagrangian $\mathcal{L} = T - U$, we do so in an inertial frame.

We can easily generalize Lagrange’s equations to systems of many particles, but 
let us first look at a couple of simple examples. 


example 7.1 One Particle in Two Dimensions; Cartesian Coordinates 

Write down Lagrange’s equations in Cartesian coordinates for a particle moving
in a conservative force field in two dimensions and show that they imply
Newton’s second law. (Of course, we have already proved this, but it is worth 
seeing it worked out explicitly.) 

The Lagrangian for a single particle in two dimensions is 

$$\mathcal{L} = \mathcal{L}(x, y, \dot{x}, \dot{y}) = T - U = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2) - U(x, y). \tag{7.12}$$

To write down the Lagrange equations we need the derivatives 


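$$\frac{\partial\mathcal{L}}{\partial x} = -\frac{\partial U}{\partial x} = F_x \quad \text{and} \quad \frac{\partial\mathcal{L}}{\partial\dot{x}} = \frac{\partial T}{\partial\dot{x}} = m\dot{x}, \tag{7.13}$$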


with corresponding expressions for the y derivatives. Thus the two Lagrange 
equations can be rewritten as follows: 


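$$\left.\begin{aligned} \frac{\partial\mathcal{L}}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}} &\iff F_x = m\ddot{x} \\ \frac{\partial\mathcal{L}}{\partial y} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{y}} &\iff F_y = m\ddot{y} \end{aligned}\right\} \iff \mathbf{F} = m\mathbf{a}. \tag{7.14}$$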


Notice how in (7.13) the derivative $\partial\mathcal{L}/\partial x$ is the x component of the force, and
$\partial\mathcal{L}/\partial\dot{x}$ is the x component of the momentum (and similarly with the y components).
When we use generalized coordinates $q_1, q_2, \ldots, q_n$, we shall find that $\partial\mathcal{L}/\partial q_i$,
although not necessarily a force component, plays a role very similar to a force.
Similarly, $\partial\mathcal{L}/\partial\dot{q}_i$, although not necessarily a momentum component, acts very like a
momentum. For this reason we shall call these derivatives the generalized force and
generalized momentum respectively; that is,

$$\frac{\partial\mathcal{L}}{\partial q_i} = (i\text{th component of generalized force}) \tag{7.15}$$

and

$$\frac{\partial\mathcal{L}}{\partial\dot{q}_i} = (i\text{th component of generalized momentum}). \tag{7.16}$$

With these notations, each of the Lagrange equations (7.11)

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_i}$$

takes the form

(generalized force) = (rate of change of generalized momentum).   (7.17)

I shall illustrate these ideas in the next example.


example 7.2 One Particle in Two Dimensions; Polar Coordinates 

Find Lagrange’s equations for the same system, a particle moving in two dimensions,
using polar coordinates.

As in all problems in Lagrangian mechanics, our first task is to write down
the Lagrangian $\mathcal{L} = T - U$ in terms of the chosen coordinates. In this case we
have been told to use polar coordinates, as sketched in Figure 7.1. This means
the components of the velocity are $v_r = \dot{r}$ and $v_\phi = r\dot{\phi}$, and the kinetic energy
is $T = \tfrac{1}{2}mv^2 = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\phi}^2)$. Therefore, the Lagrangian is

$$\mathcal{L} = \mathcal{L}(r, \phi, \dot{r}, \dot{\phi}) = T - U = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\phi}^2) - U(r, \phi). \tag{7.18}$$

Given the Lagrangian, we now have only to write down the two Lagrange
equations, one involving derivatives with respect to r and the other derivatives
with respect to $\phi$.

The r Equation 

The equation involving derivatives with respect to r (the r equation) is 

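$$\frac{\partial\mathcal{L}}{\partial r} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{r}}$$

or

$$mr\dot{\phi}^2 - \frac{\partial U}{\partial r} = \frac{d}{dt}(m\dot{r}) = m\ddot{r}. \tag{7.19}$$

Since $-\partial U/\partial r$ is just $F_r$, the radial component of F, we can rewrite the r
equation as

$$F_r = m(\ddot{r} - r\dot{\phi}^2), \tag{7.20}$$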


Figure 7.1 The velocity of a particle expressed in
two-dimensional polar coordinates: $\mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\phi}\,\hat{\boldsymbol{\phi}}$.
which you should recognize as $F_r = ma_r$, the r component of F = ma, first
derived in Equation (1.48). (The term $-r\dot{\phi}^2$ is the infamous centripetal acceleration.)
That is, when we use polar coordinates $(r, \phi)$, the Lagrange equation
corresponding to r is just the radial component of Newton’s second law.
(Note, however, that the Lagrangian derivation avoided the tedious calculation
of the components of the acceleration.) As we shall see directly, the $\phi$ equation
works a bit differently and illustrates a remarkable feature of the Lagrangian
approach.


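The $\phi$ Equation

The Lagrange equation for the coordinate $\phi$ is

$$\frac{\partial\mathcal{L}}{\partial\phi} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{\phi}} \tag{7.21}$$

or, substituting (7.18) for $\mathcal{L}$,

$$-\frac{\partial U}{\partial\phi} = \frac{d}{dt}(mr^2\dot{\phi}). \tag{7.22}$$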

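To interpret this equation, we need to relate the left side to the appropriate
component of the force $\mathbf{F} = -\nabla U$. This requires that we know the components
of $\nabla U$ in polar coordinates:

$$\nabla U = \frac{\partial U}{\partial r}\,\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial U}{\partial\phi}\,\hat{\boldsymbol{\phi}}. \tag{7.23}$$

(If you don’t remember this, see Problem 7.5.) The $\phi$ component of the force is
just the coefficient of $\hat{\boldsymbol{\phi}}$ in $\mathbf{F} = -\nabla U$, that is,

$$F_\phi = -\frac{1}{r}\frac{\partial U}{\partial\phi}.$$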

Thus the left side of (7.22) is $rF_\phi$, which is simply the torque $\Gamma$ on the particle
about the origin. Meanwhile, the quantity $mr^2\dot{\phi}$ on the right can be recognized
as the angular momentum L about the origin. Therefore, the $\phi$ equation (7.22)
states that

$$\Gamma = \frac{dL}{dt}, \tag{7.24}$$

the familiar condition from elementary mechanics, that torque equals the rate 
of change of angular momentum. 


The result (7.24) illustrates a wonderful feature of Lagrange’s equations, that when 
we choose an appropriate set of generalized coordinates the corresponding Lagrange 
equations automatically appear in a corresponding, natural form. When we choose 
r and $\phi$ for our coordinates, the $\phi$ equation turns out to be the equation for angular
momentum. In fact, the situation is even better than this. Recall that I introduced the
notion of generalized force and generalized momentum in (7.15) and (7.16). In the
present case, the $\phi$ component of the generalized force is just the torque,

$$(\phi \text{ component of generalized force}) = \frac{\partial\mathcal{L}}{\partial\phi} = \Gamma \text{ (torque)} \tag{7.25}$$

and the corresponding component of the generalized momentum is

$$(\phi \text{ component of generalized momentum}) = \frac{\partial\mathcal{L}}{\partial\dot{\phi}} = L \text{ (angular momentum)}. \tag{7.26}$$

With the “natural” choice for the coordinates (r and $\phi$) the $\phi$ components of the
generalized force and momentum turn out to be the corresponding “natural” quantities,
the torque and the angular momentum.

Notice that the generalized “force” does not necessarily have the dimensions of
force, nor the generalized “momentum” those of momentum. In the present case,
the generalized force ($\phi$ component) is a torque (that is, force × distance) and the
generalized momentum is an angular momentum (momentum × distance).
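
The identifications (7.25) and (7.26) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the polar Lagrangian (7.18) with an assumed potential U = -F0 r cos(phi) (a uniform force F0 along the x axis, chosen only so that the torque is nonzero), differentiates it by central differences, and compares the results with the torque and angular momentum computed from Cartesian components. The numbers and the helper `partial` are arbitrary choices for this example.

```python
import math

m, F0 = 2.0, 3.0   # illustrative mass and force strength (assumptions)

def lagrangian(r, phi, rdot, phidot):
    """Polar Lagrangian (7.18) with the assumed potential U = -F0 * r * cos(phi)."""
    T = 0.5 * m * (rdot**2 + r**2 * phidot**2)
    U = -F0 * r * math.cos(phi)       # uniform force F0 along the x axis
    return T - U

def partial(f, args, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f in one argument."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)

point = (1.3, 0.7, -0.4, 0.9)         # (r, phi, rdot, phidot), arbitrary values
r, phi, rdot, phidot = point
gen_force = partial(lagrangian, point, 1)       # derivative with respect to phi
gen_momentum = partial(lagrangian, point, 3)    # derivative with respect to phidot

# The same quantities from Cartesian components, with x = r cos(phi), y = r sin(phi).
x, y = r * math.cos(phi), r * math.sin(phi)
vx = rdot * math.cos(phi) - r * phidot * math.sin(phi)
vy = rdot * math.sin(phi) + r * phidot * math.cos(phi)
torque = x * 0.0 - y * F0             # x F_y - y F_x with F = (F0, 0)
ang_mom = m * (x * vy - y * vx)

print(gen_force, torque)
print(gen_momentum, ang_mom)
```

Each printed pair should agree to within the finite-difference error, even though neither quantity has the dimensions of a force or a linear momentum.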

This example illustrates another feature of Lagrange’s equations: The $\phi$ component
$\partial\mathcal{L}/\partial\phi$ of the generalized force turned out to be the torque on the particle. If the
torque happens to be zero, then the corresponding generalized momentum $\partial\mathcal{L}/\partial\dot{\phi}$
(the angular momentum, in this case) is conserved. Clearly this is a general result:
The ith component of the generalized force is $\partial\mathcal{L}/\partial q_i$. If this happens to be zero, then
the Lagrange equation

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_i}$$

says simply that the ith component $\partial\mathcal{L}/\partial\dot{q}_i$ of the generalized momentum is constant.
That is, if $\mathcal{L}$ is independent of $q_i$, the ith component of the generalized force is zero,
and the corresponding component of the generalized momentum is conserved. In
practice, it is often easy to spot that a Lagrangian is independent of a coordinate $q_i$,
and, if you can, then you immediately know a corresponding conservation law. We
shall return to this point in Section 7.8. 
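
As a rough numerical illustration of this conservation law, the sketch below integrates the r and $\phi$ equations (7.19) and (7.22) for an assumed central potential U = k r^2 / 2, which does not depend on the angle, and monitors the generalized momentum m r^2 (d phi/dt). The potential, the parameter values, the initial conditions, and the fourth-order Runge-Kutta stepper are all choices made for this example.

```python
m, k = 1.0, 1.0   # illustrative mass and spring constant (assumptions)

def derivs(state):
    """Time derivatives of (r, phi, rdot, phidot) from the two Lagrange equations."""
    r, phi, rdot, phidot = state
    rddot = r * phidot**2 - (k / m) * r      # r equation with dU/dr = k r
    phiddot = -2.0 * rdot * phidot / r       # phi equation: d/dt (m r^2 phidot) = 0
    return (rdot, phidot, rddot, phiddot)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, d, f):
        return tuple(si + f * di for si, di in zip(s, d))
    k1 = derivs(state)
    k2 = derivs(shifted(state, k1, dt / 2))
    k3 = derivs(shifted(state, k2, dt / 2))
    k4 = derivs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state):
    r, _, _, phidot = state
    return m * r**2 * phidot

state = (1.0, 0.0, 0.0, 1.5)     # r, phi, rdot, phidot at t = 0
L_start = angular_momentum(state)
dt = 1e-3
for _ in range(10000):           # integrate to t = 10
    state = rk4_step(state, dt)
L_end = angular_momentum(state)

print(L_start, L_end)
```

The radial coordinate oscillates throughout the run while the angular momentum stays fixed to within the integration error, which is the behavior expected when the Lagrangian is independent of the angle.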


Several Unconstrained Particles 

The extension of the above ideas to a system of N unconstrained particles (a gas of 
N molecules, for instance) is very straightforward, and I shall leave you to fill in the 
details (Problems 7.6 and 7.7). Here I shall just sketch the argument for the case of 
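two particles, mainly to show the form of Lagrange’s equations for N > 1. For two
particles, the Lagrangian is defined (exactly as before) as $\mathcal{L} = T - U$, but this now
means that

$$\mathcal{L}(\mathbf{r}_1, \mathbf{r}_2, \dot{\mathbf{r}}_1, \dot{\mathbf{r}}_2) = \tfrac{1}{2}m_1\dot{\mathbf{r}}_1^2 + \tfrac{1}{2}m_2\dot{\mathbf{r}}_2^2 - U(\mathbf{r}_1, \mathbf{r}_2). \tag{7.27}$$

As usual, the forces on the two particles are $\mathbf{F}_1 = -\nabla_1 U$ and $\mathbf{F}_2 = -\nabla_2 U$. Newton’s
second law can be applied to each particle and yields six equations,

$$F_{1x} = \dot{p}_{1x}, \quad F_{1y} = \dot{p}_{1y}, \quad \ldots, \quad F_{2z} = \dot{p}_{2z}.$$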
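Exactly as in Equation (7.7), each of these six equations is equivalent to a corresponding
Lagrange equation

$$\frac{\partial\mathcal{L}}{\partial x_1} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}_1}, \quad \frac{\partial\mathcal{L}}{\partial y_1} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{y}_1}, \quad \ldots, \quad \frac{\partial\mathcal{L}}{\partial z_2} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{z}_2}. \tag{7.28}$$

These six equations imply that the integral $S = \int \mathcal{L}\,dt$ is stationary. Finally, we can
change to any other suitable set of six coordinates $q_1, q_2, \ldots, q_6$. The statement that S
is stationary must also be true in this new coordinate system, and this implies in turn
that Lagrange’s equations must be true with respect to the new coordinates:

$$\frac{\partial\mathcal{L}}{\partial q_1} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_1}, \quad \frac{\partial\mathcal{L}}{\partial q_2} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_2}, \quad \ldots, \quad \frac{\partial\mathcal{L}}{\partial q_6} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_6}. \tag{7.29}$$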


An example of a set of six such generalized coordinates that we shall use repeatedly 
in Chapter 8 is this: In place of the six coordinates of $\mathbf{r}_1$ and $\mathbf{r}_2$, we could use the
three coordinates of the CM position $\mathbf{R} = (m_1\mathbf{r}_1 + m_2\mathbf{r}_2)/(m_1 + m_2)$ and the three
coordinates of the relative position $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$. We shall find that this choice of
coordinates leads to a dramatic simplification. For now, the main point is simply that 
Lagrange’s equations are automatically true in their standard form (7.29) with respect 
to these new, generalized coordinates. 
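
For a set of six quantities to serve as generalized coordinates, the map from the two positions to the CM and relative coordinates must be invertible. The sketch below checks this for sample values. The inverse formulas in `from_cm_relative` are not quoted from the text; they follow from solving the two defining relations for the individual positions, and the function names and numbers are illustrative.

```python
def to_cm_relative(r1, r2, m1, m2):
    """Map (r1, r2) to the CM position R and the relative position r = r1 - r2."""
    M = m1 + m2
    R = tuple((m1 * a + m2 * b) / M for a, b in zip(r1, r2))
    r = tuple(a - b for a, b in zip(r1, r2))
    return R, r

def from_cm_relative(R, r, m1, m2):
    """Inverse map: r1 = R + (m2/M) r and r2 = R - (m1/M) r, with M = m1 + m2."""
    M = m1 + m2
    r1 = tuple(Rc + m2 / M * rc for Rc, rc in zip(R, r))
    r2 = tuple(Rc - m1 / M * rc for Rc, rc in zip(R, r))
    return r1, r2

m1, m2 = 2.0, 3.0
r1, r2 = (1.0, -2.0, 0.5), (4.0, 0.0, -1.5)
R, r = to_cm_relative(r1, r2, m1, m2)
back1, back2 = from_cm_relative(R, r, m1, m2)

print(R, r)
print(back1, back2)
```

The round trip returns the original positions, so each configuration of the two particles corresponds to exactly one set of the six new coordinates and vice versa.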

The extension of these ideas to the case of N unconstrained particles is entirely 
straightforward, and I leave it for you to check. (See Problem 7.7.) The upshot is that 
there are 3N Lagrange equations

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_i} \qquad [i = 1, 2, \ldots, 3N],$$

valid for any choice of the 3N coordinates $q_1, \ldots, q_{3N}$ needed to describe the N
particles.


7.2 Constrained Systems; an Example 


Perhaps the greatest advantage of the Lagrangian approach is that it can handle systems 
that are constrained so that they cannot move arbitrarily in the space that they occupy. 
A familiar example of a constrained system is the bead which is threaded on a wire;
the bead can move along the wire, but not anywhere else. Another example of a very 
constrained system is a rigid body, whose individual atoms can only move in such a 
way that the distance between any two atoms is fixed. Before I discuss the nature of 
constraints in general, I shall discuss another simple example, the plane pendulum. 

Consider the simple pendulum shown in Figure 7.2. A bob of mass m is fixed to a 
massless rod, which is pivoted at O and free to swing without friction in the xy plane. 
The bob moves in both the x and y directions, but it is constrained by the rod so 
that $\sqrt{x^2 + y^2} = l$ remains constant. In an obvious sense, only one of the coordinates
is independent (as x changes, the variation of y is predetermined by the constraint 
equation), and we say that the system has only one degree of freedom. One way to 
express this would be to eliminate one of the coordinates, for instance by writing
$y = \sqrt{l^2 - x^2}$ and expressing everything in terms of the one coordinate x. Although
this is a perfectly legitimate way to proceed, a simpler way is to express both x and y in
terms of the single parameter $\phi$, the angle between the pendulum and its equilibrium
position, as shown in Figure 7.2.

Figure 7.2 A simple pendulum. The bob of mass m is constrained
by the rod to remain at distance l from O.

We can express all the quantities of interest in terms of $\phi$. The kinetic energy is
$T = \tfrac{1}{2}mv^2 = \tfrac{1}{2}ml^2\dot{\phi}^2$. The potential energy is U = mgh where h denotes the height of
the bob above its equilibrium position and is (as you should check) $h = l(1 - \cos\phi)$.
Thus the potential energy is $U = mgl(1 - \cos\phi)$, and the Lagrangian is

$$\mathcal{L} = T - U = \tfrac{1}{2}ml^2\dot{\phi}^2 - mgl(1 - \cos\phi). \tag{7.30}$$


Whichever way we choose to proceed, whether we write everything in terms of x (or y) or
of $\phi$, the Lagrangian is expressed in terms of a single generalized coordinate q and its
time derivative $\dot{q}$, in the form $\mathcal{L} = \mathcal{L}(q, \dot{q})$. Now, it is a fact (which I shall not prove
just yet) that once the Lagrangian is written in terms of this one variable (for a system
with one degree of freedom), the evolution of the system again satisfies Lagrange’s
equation (just as we proved for an unconstrained particle in the previous section).
That is,

$$\frac{\partial\mathcal{L}}{\partial q} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}}. \tag{7.31}$$

If we choose the angle $\phi$ as our generalized coordinate, then Lagrange’s equation
reads

$$\frac{\partial\mathcal{L}}{\partial\phi} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{\phi}}. \tag{7.32}$$


The Lagrangian $\mathcal{L}$ is given by (7.30), and the needed derivatives are easily evaluated
to give

$$-mgl\sin\phi = \frac{d}{dt}(ml^2\dot{\phi}) = ml^2\ddot{\phi}. \tag{7.33}$$

Referring to Figure 7.2 you can see that the left side of this equation is just the torque
$\Gamma$ exerted by gravity on the pendulum, while the term $ml^2$ is the pendulum’s moment
of inertia I. Since $\ddot{\phi}$ is the angular acceleration $\alpha$, we see that Lagrange’s equation for
the simple pendulum simply reproduces the familiar result $\Gamma = I\alpha$.
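
The pendulum equation (7.33) has no elementary closed-form solution for large swings, but it is easy to integrate numerically. The following is a minimal sketch, not a definitive implementation: it divides (7.33) by m l^2, steps the result with a fourth-order Runge-Kutta method, and provides a way to check that the energy T + U stays constant and that small oscillations have a period close to the familiar small-angle value 2 pi sqrt(l/g). The values of g and l, the step sizes, and the function names are choices made for this example.

```python
import math

g, l = 9.8, 1.0   # illustrative values in SI units (assumptions)

def derivs(phi, omega):
    """(7.33) divided by m l^2: the angular acceleration is -(g/l) sin(phi)."""
    return omega, -(g / l) * math.sin(phi)

def rk4_step(phi, omega, dt):
    """One classical fourth-order Runge-Kutta step for (phi, omega = d phi/dt)."""
    k1p, k1w = derivs(phi, omega)
    k2p, k2w = derivs(phi + dt / 2 * k1p, omega + dt / 2 * k1w)
    k3p, k3w = derivs(phi + dt / 2 * k2p, omega + dt / 2 * k2w)
    k4p, k4w = derivs(phi + dt * k3p, omega + dt * k3w)
    return (phi + dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p),
            omega + dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w))

def energy_per_mass(phi, omega):
    """(T + U)/m for the pendulum, using the T and U that enter (7.30)."""
    return 0.5 * l**2 * omega**2 + g * l * (1 - math.cos(phi))

def period(phi0, dt=1e-4):
    """Release from rest at phi0 > 0, time the swing to the far side, and double it."""
    phi, omega, t = phi0, 0.0, 0.0
    while True:
        new_phi, new_omega = rk4_step(phi, omega, dt)
        t += dt
        if omega < 0.0 and new_omega >= 0.0:   # reached the far turning point
            return 2 * t
        phi, omega = new_phi, new_omega

print(period(0.05), 2 * math.pi * math.sqrt(l / g))
```

Calling `period` with a larger release angle shows the period lengthening, an effect that the small-angle approximation misses.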


7.3 Constrained Systems in General

Generalized Coordinates 

Consider now an arbitrary system of N particles, $\alpha = 1, \ldots, N$ with positions $\mathbf{r}_\alpha$.
We say that the parameters $q_1, \ldots, q_n$ are a set of generalized coordinates for the
system if each position $\mathbf{r}_\alpha$ can be expressed as a function of $q_1, \ldots, q_n$, and possibly
the time t,

$$\mathbf{r}_\alpha = \mathbf{r}_\alpha(q_1, \ldots, q_n, t) \qquad [\alpha = 1, \ldots, N], \tag{7.34}$$

and conversely each $q_i$ can be expressed in terms of the $\mathbf{r}_\alpha$ and possibly t,

$$q_i = q_i(\mathbf{r}_1, \ldots, \mathbf{r}_N, t) \qquad [i = 1, \ldots, n]. \tag{7.35}$$

In addition, we require that the number of the generalized coordinates (n) is the
smallest number that allows the system to be parametrized in this way. In our
three-dimensional world, the number n of generalized coordinates for N particles is certainly
no more than 3N and, for a constrained system, is usually less, sometimes
dramatically so. For example, for a rigid body, the number of particles N may be of
order $10^{23}$, whereas the number of generalized coordinates n is 6 (three coordinates to
give the position of the center of mass and three to give the orientation of the body).

To illustrate the relation (7.34), consider again the simple pendulum of Figure 7.2. 
There is one particle (the bob) and two Cartesian coordinates (since the pendulum is 
restricted to two dimensions). As we saw, there is just one generalized coordinate, 
which we took to be the angle $\phi$. The analog of (7.34) is

$$\mathbf{r} = (x, y) = (l\sin\phi, l\cos\phi) \tag{7.36}$$

and expresses the two Cartesian coordinates x and y in terms of the one generalized
coordinate $\phi$.

The double pendulum shown in Figure 7.3 has two bobs, both confined to a plane, 
so it has four Cartesian coordinates, all of which can be expressed in terms of the two 
generalized coordinates $\phi_1$ and $\phi_2$. Specifically, if we put our origin at the suspension
point of the top pendulum,

$$\mathbf{r}_1 = (l_1\sin\phi_1,\; l_1\cos\phi_1) = \mathbf{r}_1(\phi_1) \tag{7.37}$$

and

$$\mathbf{r}_2 = (l_1\sin\phi_1 + l_2\sin\phi_2,\; l_1\cos\phi_1 + l_2\cos\phi_2) = \mathbf{r}_2(\phi_1, \phi_2). \tag{7.38}$$

Notice that the components of $\mathbf{r}_2$ depend on both $\phi_1$ and $\phi_2$.
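The map from generalized to Cartesian coordinates in (7.37) and (7.38) can be written
out as a short function. The following is only an illustrative sketch: the function name
and the default lengths are assumptions made for the example, and $y$ is measured
downward from the suspension point, as in (7.36).

```python
import math

def double_pendulum_positions(phi1, phi2, l1=1.0, l2=1.0):
    """Return (r1, r2) as (x, y) pairs from (7.37) and (7.38); y points downward."""
    r1 = (l1 * math.sin(phi1), l1 * math.cos(phi1))
    r2 = (r1[0] + l2 * math.sin(phi2), r1[1] + l2 * math.cos(phi2))
    return r1, r2

# Four Cartesian coordinates are fixed by just two angles.
r1, r2 = double_pendulum_positions(math.pi / 6, math.pi / 3)
print(r1, r2)
```

Whatever angles are supplied, the first bob stays a distance $l_1$ from the origin and the
second stays a distance $l_2$ from the first, which is exactly the pair of constraints that
reduces four Cartesian coordinates to two generalized ones.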
Figure 7.3 The positions of both masses in a double pendulum
are uniquely specified by the two generalized coordinates $\phi_1$ and
$\phi_2$, which can themselves be varied independently.


In these two examples, the transformation between the Cartesian and the generalized
coordinates did not depend on the time $t$, but it is easy to think of examples in
which it does. Consider the railroad car shown in Figure 7.4, which has a pendulum
suspended from its ceiling and is being forced³ to accelerate with a fixed acceleration
$a$. It is natural to specify the position of the pendulum by the angle $\phi$ as usual,
but we must recognize that, in the first instance, this gives the pendulum’s position
relative to the accelerating, and hence noninertial, reference frame of the car. If we
wish to specify the bob’s position relative to an inertial frame, we can choose a frame
fixed relative to the ground, and we can easily express the position relative to this
inertial frame in terms of the angle $\phi$. The position of the point of suspension relative
to the ground is (if we choose our axes and origin properly) just $x_s = \tfrac{1}{2}at^2$, and the
position of the bob is then easily seen to be

$$\mathbf{r} = (x, y) = (l\sin\phi + \tfrac{1}{2}at^2,\; l\cos\phi) = \mathbf{r}(\phi, t). \tag{7.39}$$



Figure 7.4 A pendulum is suspended from the roof of a railroad
car that is being forced to accelerate with a fixed, known
acceleration $a$.


3 The word “forced” is often used to describe a motion that is imposed by some outside agent and 
is unaffected by the internal motions of the system. In the present example, the “forced” acceleration 
of the car is assumed to be the same whatever the oscillations of the pendulum. 
The relation between $\mathbf{r}$ and the generalized coordinate $\phi$ depends on the time $t$, a
possibility that I allowed for when writing (7.34).
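A small numerical sketch makes the time dependence in (7.39) concrete. The function
names, the default values of $l$ and $a$, and the crude two-time comparison are all
assumptions chosen for illustration; $y$ is again measured downward.

```python
import math

def bob_position(phi, t, l=1.0, a=2.0):
    """Ground-frame position (x, y) of the bob from (7.39)."""
    x_s = 0.5 * a * t**2              # position of the point of suspension
    return (l * math.sin(phi) + x_s, l * math.cos(phi))

def same_at_two_times(position, phi, t1=0.0, t2=1.0):
    """Crude check of time independence: compare position(phi, t) at two times."""
    return position(phi, t1) == position(phi, t2)

print(bob_position(0.3, 0.0))
print(bob_position(0.3, 1.0))
print(same_at_two_times(bob_position, 0.3))   # False: the same angle gives different positions
```

The same value of $\phi$ corresponds to different points in the ground frame at different
times, which is what it means for the relation between $\mathbf{r}$ and $\phi$ to involve $t$.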

We shall sometimes describe a set of coordinates $q_1, \cdots, q_n$ as natural if the
relation (7.34) between the Cartesian coordinates $\mathbf{r}_\alpha$ and the generalized coordinates
does not involve the time $t$. We shall find certain convenient properties of natural
coordinates that do not generally apply to coordinates for which (7.34) does involve
the time. Fortunately, as the name implies, there are many problems for which the
most convenient choice of coordinates is also natural.⁴

Degrees of Freedom 

The number of degrees of freedom of a system is the number of coordinates that 
can be independently varied in a small displacement — the number of independent 
“directions” in which the system can move from any given initial configuration. For 
example, the simple pendulum of Figure 7.2 has just one degree of freedom, while the 
double pendulum of Figure 7.3 has two. A particle that is free to move anywhere in 
three dimensions has three degrees of freedom, while a gas composed of $N$ particles
has $3N$.

When the number of degrees of freedom of an $N$-particle system in three dimensions
is less than $3N$, we say that the system is constrained. (In two dimensions, the
corresponding number is $2N$, of course.) The bob of a simple pendulum, with one
degree of freedom, is constrained. The two masses of a double pendulum, with two
degrees of freedom, are constrained. The $N$ atoms of a rigid body have just six degrees
of freedom and are certainly constrained. Other examples are a bead constrained to 
slide on a fixed wire and a particle constrained to move on a fixed surface in three 
dimensions. 

In all of the examples I have given so far, the number of degrees of freedom 
was equal to the number of generalized coordinates needed to describe the system’s 
configuration. (The double pendulum has two degrees of freedom and needs two 
generalized coordinates, and so on.) A system with this natural-seeming property is
said to be holonomic.⁵ That is, a holonomic system has $n$ degrees of freedom and
can be described by $n$ generalized coordinates, $q_1, \cdots, q_n$. Holonomic systems are
easier to treat than nonholonomic, and in this book I shall restrict myself to holonomic 
systems. 

⁴ Natural coordinates are sometimes called scleronomous, and those that are not natural,
rheonomous. I shall not use these outstandingly forgettable names. Nonnatural coordinates are
also sometimes called forced, since a time dependence in the relation (7.34) is usually associated
with a forced motion, such as the forced acceleration of the car in Figure 7.4.

⁵ Many different definitions of “holonomic” can be found, not all of which are exactly equivalent.

You might imagine that all systems would be holonomic, or at least that nonholonomic
systems would be rare and bizarrely complicated. In fact, there are some quite
simple examples of nonholonomic systems. Consider, for instance, a hard rubber ball
that is free to roll (but not to slide nor to spin about a vertical axis) on a horizontal
table. Starting at any position $(x, y)$ it can move in only two independent directions.
Therefore, the ball has two degrees of freedom, and you might well imagine that its
configuration could be uniquely specified by two coordinates, $x$ and $y$, of its center.
But consider the following: Let us place the ball at the origin $O$ and make a mark on
its top point. Now, carry out the following three moves. (See Figure 7.5.) Roll the ball
along the $x$ axis for a distance equal to the circumference $c$, to a point $P$, where the
mark will once again be on the top. Now roll it the same distance $c$ in the $y$ direction to
$Q$, where the mark is again on top. Finally roll it straight back to the origin along the
hypotenuse of the triangle $OPQ$. Since this last move has length $\sqrt{2}\,c$, it brings the ball
back to its starting point, but with the mark no longer on the top. The position $(x, y)$
has returned to its initial value, but the ball now has a different orientation. Evidently
the two coordinates $(x, y)$ are not enough to specify a unique configuration. In fact,
three more numbers are needed to specify the orientation of the ball, and we need five
coordinates in all to specify the configuration completely. The ball has two degrees
of freedom but needs five generalized coordinates. Evidently it is a nonholonomic
system.

Figure 7.5 The right triangle $OPQ$ lies in the $xy$ plane with sides
$OP$ and $PQ$ of length $c$. If you roll a ball of circumference $c$ around
$OPQ$, it will return to its starting point with a changed orientation.
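The three rolls around the triangle $OPQ$ can be checked numerically. The sketch below
is an illustration under stated assumptions: the ball has unit radius, rolling without
slipping a distance $s$ in a horizontal direction $\hat{\mathbf{d}}$ is modeled as a rotation by the angle
$s/R$ about the horizontal axis $\hat{\mathbf{z}} \times \hat{\mathbf{d}}$, and the helper names `rotate` and `roll` are
hypothetical.

```python
import math

def rotate(v, axis, angle):
    """Rotate vector v about a unit axis by the given angle (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * b for a, b in zip(axis, v))
    cross = (axis[1] * v[2] - axis[2] * v[1],
             axis[2] * v[0] - axis[0] * v[2],
             axis[0] * v[1] - axis[1] * v[0])
    return tuple(v[i] * c + cross[i] * s + axis[i] * dot * (1 - c) for i in range(3))

def roll(mark, center, direction, distance, radius):
    """Roll without slipping along a horizontal unit direction; return new (mark, center)."""
    dx, dy = direction
    axis = (-dy, dx, 0.0)                         # z-hat cross direction
    mark = rotate(mark, axis, distance / radius)
    center = (center[0] + dx * distance, center[1] + dy * distance)
    return mark, center

radius = 1.0
circumference = 2 * math.pi * radius
d = 1 / math.sqrt(2)
mark, center = (0.0, 0.0, 1.0), (0.0, 0.0)        # mark starts on top, ball at O
moves = [((1.0, 0.0), circumference),             # O to P
         ((0.0, 1.0), circumference),             # P to Q
         ((-d, -d), math.sqrt(2) * circumference)]  # Q back to O along the hypotenuse
for direction, distance in moves:
    mark, center = roll(mark, center, direction, distance, radius)

print(center)   # back at the origin, up to rounding
print(mark)     # the mark is no longer at (0, 0, 1)
```

The center returns to $(0, 0)$ while the mark does not return to the top, so $(x, y)$ alone
cannot label the configuration.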

Although nonholonomic systems certainly exist, they are more complicated to analyze
than holonomic systems, and I shall not discuss them further. For any holonomic
system with generalized coordinates $q_1, \cdots, q_n$ and potential energy $U(q_1, \cdots, q_n, t)$
(which may depend on the time $t$ as described in Section 4.5), the evolution in time
is determined by the $n$ Lagrange equations

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_i} \qquad [i = 1, \cdots, n], \tag{7.40}$$

where the Lagrangian $\mathcal{L}$ is defined as usual to be $\mathcal{L} = T - U$. I shall prove this result
in Section 7.4.


7.4 Proof of Lagrange’s Equations with Constraints 


We are now ready to prove Lagrange’s equations for any holonomic system. To keep 
things reasonably simple, I shall treat explicitly the case that there is just one particle. 
(The generalization to arbitrary numbers of particles is fairly straightforward; see
Problem 7.13.) To be definite, I shall suppose the particle is constrained to move on a
surface.⁶ This means that it has two degrees of freedom and can be described by two
generalized coordinates $q_1$ and $q_2$ that can vary independently.

We must recognize that there are two kinds of forces on the particle (or particles, in 
the general case). First, there are the forces of constraint: For a bead on a wire the
constraining force is the normal force of the wire on the bead; for our particle, constrained
to move on a surface, it is the normal force of the surface. For the atoms in a rigid 
body, the constraining forces are the interatomic forces that hold the atoms in place 
within the body. In general, the forces of constraint are not necessarily conservative, 
but this doesn’t matter. One of the objectives of the Lagrangian approach is to find 
equations that do not involve the constraining forces, which we usually don’t want to 
know anyway. (Notice, however, that if the constraining forces are nonconservative, 
Lagrange’s equations in the simple unconstrained form of Section 7.1 certainly do not 
apply.) I shall denote the net constraining force on the particle by $\mathbf{F}_{\text{cstr}}$, which in our
case is just the normal force of the surface to which the particle is confined.

Second, there are all the other “nonconstraint” forces on the particle, such as 
gravity. These are the forces with which we are usually concerned in practice, and 
I shall denote their resultant by $\mathbf{F}$. I shall assume that the nonconstraint forces all
satisfy at least the second condition for conservatism, so that they are derivable from
a potential energy, $U(\mathbf{r}, t)$, and

$$\mathbf{F} = -\nabla U(\mathbf{r}, t). \tag{7.41}$$

(If all the nonconstraint forces are actually conservative, then $U$ is independent of $t$,
but we don’t need to assume this.) The total force on our particle is
$\mathbf{F}_{\text{tot}} = \mathbf{F}_{\text{cstr}} + \mathbf{F}$. Finally, I shall define the Lagrangian, as usual, to be

$$\mathcal{L} = T - U. \tag{7.42}$$

Since $U$ is the potential energy for the nonconstraint forces only, this definition of $\mathcal{L}$
excludes the constraint forces. This correctly reflects that Lagrange’s equations for a
constrained system cleverly eliminate the constraint forces, as we shall see.


The Action Integral is Stationary at the Right Path 

Consider any two points $\mathbf{r}_1$ and $\mathbf{r}_2$ through which the particle passes at times $t_1$ and
$t_2$. I shall denote by $\mathbf{r}(t)$ the “right” path, the actual path that our particle follows
between the two points, and by $\mathbf{R}(t)$ any neighboring “wrong” path between the same
two points. It is convenient to write

$$\mathbf{R}(t) = \mathbf{r}(t) + \boldsymbol{\epsilon}(t), \tag{7.43}$$

which defines $\boldsymbol{\epsilon}(t)$ as the infinitesimal vector pointing from $\mathbf{r}(t)$ on the right path to
the corresponding point $\mathbf{R}(t)$ on the wrong path. Since I shall assume that both of
the points $\mathbf{r}(t)$ and $\mathbf{R}(t)$ lie in the surface to which the particle is confined, the vector
$\boldsymbol{\epsilon}(t)$ is contained in the same surface. Since both $\mathbf{r}(t)$ and $\mathbf{R}(t)$ go through the same
endpoints, $\boldsymbol{\epsilon}(t) = 0$ at $t_1$ and $t_2$.

⁶ Actually, it is a bit hard to imagine how to constrain a particle to a single surface so that it
can’t jump off. If this worries you, you can imagine the particle sandwiched between two parallel
surfaces with just enough gap between them to let it slide freely.

Let us denote by $S$ the action integral

$$S = \int_{t_1}^{t_2} \mathcal{L}(\mathbf{R}, \dot{\mathbf{R}}, t)\,dt, \tag{7.44}$$

taken along any path $\mathbf{R}(t)$ lying in the constraining surface, and by $S_0$ the corresponding
integral taken along the right path $\mathbf{r}(t)$. As I shall now prove, the integral $S$ is
stationary for variations of the path $\mathbf{R}(t)$, when $\mathbf{R}(t) = \mathbf{r}(t)$ or, equivalently, when
the difference $\boldsymbol{\epsilon}$ is zero. Another way to say this is that the difference in the action
integrals

$$\delta S = S - S_0 \tag{7.45}$$

is zero to first order in the distance $\boldsymbol{\epsilon}$ between the paths, and this is what I shall prove.

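The difference (7.45) is the integral of the difference between the Lagrangians on
the two paths,

$$\delta\mathcal{L} = \mathcal{L}(\mathbf{R}, \dot{\mathbf{R}}, t) - \mathcal{L}(\mathbf{r}, \dot{\mathbf{r}}, t). \tag{7.46}$$

If we substitute $\mathbf{R}(t) = \mathbf{r}(t) + \boldsymbol{\epsilon}(t)$ and

$$\mathcal{L}(\mathbf{r}, \dot{\mathbf{r}}, t) = T - U = \tfrac{1}{2}m\dot{\mathbf{r}}^2 - U(\mathbf{r}, t),$$

this becomes⁷

$$\begin{aligned}
\delta\mathcal{L} &= \tfrac{1}{2}m\left[(\dot{\mathbf{r}} + \dot{\boldsymbol{\epsilon}})^2 - \dot{\mathbf{r}}^2\right] - \left[U(\mathbf{r} + \boldsymbol{\epsilon}, t) - U(\mathbf{r}, t)\right] \\
&= m\dot{\mathbf{r}}\cdot\dot{\boldsymbol{\epsilon}} - \boldsymbol{\epsilon}\cdot\nabla U + O(\epsilon^2),
\end{aligned}$$

where $O(\epsilon^2)$ denotes terms involving squares and higher powers of $\boldsymbol{\epsilon}$ and $\dot{\boldsymbol{\epsilon}}$. Returning
to the difference (7.45) in the two action integrals, we find that, to first order in $\boldsymbol{\epsilon}$,

$$\delta S = \int_{t_1}^{t_2} \delta\mathcal{L}\,dt = \int_{t_1}^{t_2} \left[m\dot{\mathbf{r}}\cdot\dot{\boldsymbol{\epsilon}} - \boldsymbol{\epsilon}\cdot\nabla U\right]dt. \tag{7.47}$$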

The first term in the final integral can be integrated by parts. (Recall that this just 
means moving the time derivative from one factor to the other and changing the sign.) 
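The difference $\boldsymbol{\epsilon}$ is zero at the two endpoints, so the endpoint contribution is zero, and
we get

$$\delta S = -\int_{t_1}^{t_2} \boldsymbol{\epsilon}\cdot\left[m\ddot{\mathbf{r}} + \nabla U\right]dt. \tag{7.48}$$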
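For reference, the integration by parts behind (7.48) can be written out in one line.
This is only a sketch of that single step, using nothing beyond the boundary condition
that $\boldsymbol{\epsilon}$ vanishes at $t_1$ and $t_2$:

$$\int_{t_1}^{t_2} m\,\dot{\mathbf{r}}\cdot\dot{\boldsymbol{\epsilon}}\,dt
 = \Big[\,m\,\dot{\mathbf{r}}\cdot\boldsymbol{\epsilon}\,\Big]_{t_1}^{t_2}
 - \int_{t_1}^{t_2} m\,\ddot{\mathbf{r}}\cdot\boldsymbol{\epsilon}\,dt
 = -\int_{t_1}^{t_2} m\,\ddot{\mathbf{r}}\cdot\boldsymbol{\epsilon}\,dt.$$

Combining this with the $-\boldsymbol{\epsilon}\cdot\nabla U$ term of (7.47) gives the integrand of (7.48).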


⁷ To understand the second term in the second line, recall that
$f(\mathbf{r} + \boldsymbol{\epsilon}) - f(\mathbf{r}) \approx \boldsymbol{\epsilon}\cdot\nabla f$, for any scalar function $f(\mathbf{r})$. See Section 4.3.
Now, the path $\mathbf{r}(t)$ is the “right” path and satisfies Newton’s second law. Therefore the
term $m\ddot{\mathbf{r}}$ is just the total force on the particle, $\mathbf{F}_{\text{tot}} = \mathbf{F}_{\text{cstr}} + \mathbf{F}$. Meanwhile
$\nabla U = -\mathbf{F}$. Therefore, the second term in (7.48) cancels the second piece of the first,
and we are left with

$$\delta S = -\int_{t_1}^{t_2} \boldsymbol{\epsilon}\cdot\mathbf{F}_{\text{cstr}}\,dt. \tag{7.49}$$

But the constraint force $\mathbf{F}_{\text{cstr}}$ is normal to the surface in which our particle moves,
while $\boldsymbol{\epsilon}$ lies in the surface. Therefore $\boldsymbol{\epsilon}\cdot\mathbf{F}_{\text{cstr}} = 0$, and we have proved that $\delta S = 0$.
That is, the action integral is stationary at the right path, as claimed.⁸
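The claim that $\delta S$ vanishes to first order can be tested numerically on a simple
constrained system. The sketch below is an illustration, not part of the proof. It
assumes a bead sliding from rest down a frictionless straight incline of angle `alpha`,
described by the distance $s$ along the incline, so that $\mathcal{L} = \tfrac{1}{2}m\dot{s}^2 + mgs\sin\alpha$ and
the right path is $s = \tfrac{1}{2}(g\sin\alpha)t^2$. The wrong paths add `eps * sin(pi t / t_end)`,
which vanishes at both endpoints and stays on the incline. All names and parameter
values are assumptions.

```python
import math

def action(path, velocity, lagrangian, t1, t2, steps=2000):
    """Approximate S = integral of L dt with the composite midpoint rule."""
    h = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        t = t1 + (k + 0.5) * h
        total += lagrangian(path(t), velocity(t), t) * h
    return total

m, g, alpha, t_end = 1.0, 9.8, math.radians(30), 1.0
acc = g * math.sin(alpha)                 # acceleration along the incline

def lagrangian(s, s_dot, t):
    """L = T - U with U = -m g s sin(alpha), s measured down the incline."""
    return 0.5 * m * s_dot**2 + m * g * s * math.sin(alpha)

def delta_S(eps):
    """Action on the varied path minus action on the right path."""
    w = math.pi / t_end
    right = action(lambda t: 0.5 * acc * t**2, lambda t: acc * t,
                   lagrangian, 0.0, t_end)
    wrong = action(lambda t: 0.5 * acc * t**2 + eps * math.sin(w * t),
                   lambda t: acc * t + eps * w * math.cos(w * t),
                   lagrangian, 0.0, t_end)
    return wrong - right

for eps in (0.1, 0.05, 0.025):
    print(eps, delta_S(eps))
```

Halving `eps` should reduce `delta_S` by a factor of about four, which is the signature of
a difference that is second order in $\boldsymbol{\epsilon}$ with no first-order term.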


The Final Proof 


We have proved Hamilton’s principle, that the action integral is stationary at the path 
which the particle actually follows. However, we have proved it, not for arbitrary 
variations of the path, but rather for those variations of path that are consistent with 
the constraints — that is, paths which lie in the surface to which our particle is 
constrained. This means that we cannot prove Lagrange’s equations with respect to 
the three Cartesian coordinates. On the other hand, we can prove them with respect to 
the appropriate generalized coordinates. We are assuming that our particle is confined 
by holonomic constraints to move on a surface, that is, a two-dimensional subset 
of the full three-dimensional world. This means that the particle has two degrees of
freedom and can be described by two generalized coordinates, $q_1$ and $q_2$, that can be
varied independently. Any variation of $q_1$ and $q_2$ is consistent with the constraints.⁹
Accordingly, we can rewrite the action integral in terms of $q_1$ and $q_2$ as


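$$S = \int_{t_1}^{t_2} \mathcal{L}(q_1, q_2, \dot{q}_1, \dot{q}_2, t)\,dt, \tag{7.50}$$

and this integral is stationary for any variations of $q_1$ and $q_2$ about the correct path
$[q_1(t), q_2(t)]$. Therefore, by the argument of Chapter 6 the correct path must satisfy
the two Lagrange equations

$$\frac{\partial\mathcal{L}}{\partial q_1} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_1}
\quad\text{and}\quad
\frac{\partial\mathcal{L}}{\partial q_2} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_2}. \tag{7.51}$$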


The proof that I have given here applies directly only to a single particle in three
dimensions, constrained to move on a two-dimensional surface, but the main ideas
of the general case are all present. The generalization is, for the most part, relatively
straightforward (see Problem 7.13), and meanwhile I hope that I have said enough
to convince you of the truth of the general result: For any holonomic system, with $n$
degrees of freedom and $n$ generalized coordinates, and with the nonconstraint forces
derivable from a potential energy $U(q_1, \cdots, q_n, t)$, the path followed by the system
is determined by the $n$ Lagrange equations

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_i} \qquad [i = 1, \cdots, n], \tag{7.52}$$

where $\mathcal{L}$ is the Lagrangian $\mathcal{L} = T - U$ and $U = U(q_1, \cdots, q_n, t)$ is the total potential
energy corresponding to all the forces excluding the forces of constraint.

⁸ The observation that the integrand in (7.49) is zero is really the crucial step in our proof. When
you consider the generalization of the proof to an arbitrary constrained system (for instance, if you
do Problem 7.13), you will find that there is a corresponding step and that the corresponding term
is zero, for the same reason: The forces of constraint would do no work in a displacement that is
consistent with the constraints. Indeed this is one possible definition of a force of constraint.

⁹ For example, if our surface is a sphere, centered at the origin, then the generalized coordinates
$q_1$, $q_2$ could be the two angles $\theta$, $\phi$ of spherical polar coordinates. Any variation of $\theta$ and $\phi$ is
consistent with the constraint that the particle remain on the sphere.

It was essential to our proof of Lagrange’s equations that the nonconstraint forces
be conservative (or, at a minimum, that they satisfy the second condition for
conservatism) so that they are derivable from a potential energy, $\mathbf{F} = -\nabla U$. If this is not
true, then Lagrange’s equations may not hold, at least in the form (7.52). An obvious
example of a force that does not satisfy this condition is sliding friction. Sliding
friction cannot be regarded as a force of constraint (it is not normal to the surface) 
and cannot be derived from a potential energy. Thus, when sliding friction is present, 
Lagrange’s equations do not hold in the form (7.52). Lagrange’s equations can be 
modified to include forces like friction (see Problem 7.12), but the result is clumsy 
and I shall confine myself to situations where the equations (7.52) do hold. 


7.5 Examples of Lagrange’s Equations 


In this section I present five examples of the use of Lagrange’s equations. The first two 
are sufficiently simple that they can be easily solved within the Newtonian formalism. 
My main purpose for including them is just to give you experience with using the 
Lagrangian approach. Nevertheless, even these simple cases show some advantages 
of the Lagrangian over the Newtonian formalism; in particular, we shall see how the 
Lagrangian approach obviates any need to consider the forces of constraint. The last 
three examples are sufficiently complex that solution using the Newtonian approach 
requires considerable ingenuity; by contrast, the Lagrangian approach lets us write 
down the equations of motion almost without thinking. 

The examples given here illustrate an important point to recognize about Lagrange’s
equations: The Lagrangian formalism always (or nearly always) gives a
straightforward means of writing down the equations of motion. On the other hand, it 
cannot guarantee that the resulting equations are easy to solve. If we are very lucky, 
the equations of motion may have an analytic solution, but, even when they do not, 
they are the essential first step to understanding the solutions and they often suggest 
a starting point for an approximate solution. The equations of motion can give simple 
answers to certain subsidiary questions. (For instance, once we have the equations of
motion, we can usually find the positions of equilibrium of a system very easily.) And
we can always solve the equations of motion numerically for given initial conditions.

The Lagrangian method is so important that it certainly deserves more than just 
five examples. However, the crucial thing is that you work through several examples 
yourself; therefore I have given plenty of problems at the end of the chapter, and it is 
essential that you work several of these as soon as possible after reading this section. 


Example 7.3 Atwood’s Machine

Consider the Atwood machine first met in Figure 4.15 and shown again in Figure
7.6, in which the two masses $m_1$ and $m_2$ are suspended by an inextensible string
(length $l$) which passes over a massless pulley with frictionless bearings and
radius $R$. Write down the Lagrangian $\mathcal{L}$, using the distance $x$ as generalized
coordinate, find the Lagrange equation of motion, and solve it for the acceleration
$\ddot{x}$. Compare your results with the Newtonian solution.

Figure 7.6 An Atwood machine consisting of two masses, $m_1$
and $m_2$, suspended by a massless inextensible string that passes
over a massless, frictionless pulley of radius $R$. Because the
string’s length is fixed, the position of the whole system can
be specified by a single variable, which we can take to be the
distance $x$.

Because the string has fixed length, the heights $x$ and $y$ of the two masses
cannot vary independently. Rather, $x + y + \pi R = l$, the length of the string, so
that $y$ can be expressed in terms of $x$ as

$$y = -x + \text{const}. \tag{7.53}$$

Therefore, we can use $x$ as our one generalized coordinate. From (7.53) we see
that $\dot{y} = -\dot{x}$, so that the kinetic energy of the system is

$$T = \tfrac{1}{2}m_1\dot{x}^2 + \tfrac{1}{2}m_2\dot{y}^2 = \tfrac{1}{2}(m_1 + m_2)\dot{x}^2,$$

while the potential energy is

$$U = -m_1 g x - m_2 g y = -(m_1 - m_2)gx + \text{const}.$$

Combining these, we find the Lagrangian

$$\mathcal{L} = T - U = \tfrac{1}{2}(m_1 + m_2)\dot{x}^2 + (m_1 - m_2)gx, \tag{7.54}$$

where I have dropped an uninteresting constant.

The Lagrange equation of motion is just

$$\frac{\partial\mathcal{L}}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}}$$

or, substituting (7.54) for $\mathcal{L}$,

$$(m_1 - m_2)g = (m_1 + m_2)\ddot{x}, \tag{7.55}$$

which we can solve at once to give the desired acceleration

$$\ddot{x} = \frac{m_1 - m_2}{m_1 + m_2}\,g. \tag{7.56}$$

By choosing $m_1$ and $m_2$ fairly close together, one can make this acceleration
much less than $g$, and hence much easier to measure. Therefore, the Atwood
machine gave an early and reasonably accurate method for measuring $g$.

The corresponding Newtonian solution requires us to write down Newton’s
second law for each of the masses separately. The net force on $m_1$ is $m_1 g - F_t$,
where $F_t$ is the tension in the string. (This is the force of constraint and needed
no consideration in the Lagrangian solution.) Thus Newton’s second law for $m_1$
is

$$m_1 g - F_t = m_1\ddot{x}.$$

In the same way, Newton’s second law for $m_2$ reads

$$F_t - m_2 g = m_2\ddot{x}.$$

(Remember that the upward acceleration of $m_2$ is the same as the downward
acceleration of $m_1$.) We see that the Newtonian approach has given us two
equations for two unknowns, the required acceleration $\ddot{x}$ and the force of constraint
$F_t$. By adding these two equations, we can eliminate $F_t$ and arrive at precisely
the equation (7.55) of the Lagrangian method and thence the same value (7.56)
for $\ddot{x}$.

The Newtonian solution of the Atwood machine is too simple for us to get 
very excited by an alternative solution. Nevertheless, this simple example does 
illustrate how the Lagrangian approach allows us to ignore the unknown (and 
usually uninteresting) force of constraint and to eliminate at least one step of 
the Newtonian solution. 
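The agreement between the two routes in Example 7.3 can be confirmed numerically. The
sketch below is illustrative only: the function names and the sample masses are
assumptions. It compares the Lagrangian result (7.56) with a direct solution of the two
Newtonian equations for the pair of unknowns $\ddot{x}$ and $F_t$.

```python
def atwood_lagrangian(m1, m2, g=9.8):
    """Downward acceleration of m1 from the Lagrange equation, as in (7.56)."""
    return (m1 - m2) / (m1 + m2) * g

def atwood_newtonian(m1, m2, g=9.8):
    """Solve m1*g - Ft = m1*a and Ft - m2*g = m2*a for (a, Ft) by Cramer's rule.

    Rearranged as a linear system in the unknowns (a, Ft):
        m1*a + Ft =  m1*g
        m2*a - Ft = -m2*g
    """
    det = m1 * (-1.0) - 1.0 * m2
    a = (m1 * g * (-1.0) - 1.0 * (-m2 * g)) / det
    Ft = (m1 * (-m2 * g) - m2 * (m1 * g)) / det
    return a, Ft

a, Ft = atwood_newtonian(1.1, 1.0)
print(a, atwood_lagrangian(1.1, 1.0))   # the two accelerations agree
print(Ft)                               # the tension, which the Lagrangian route never needed
```

Only the Newtonian route produces the tension; the Lagrangian route reaches the
acceleration without ever introducing it.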
Example 7.4 A Particle Confined to Move on a Cylinder

Consider a particle of mass $m$ constrained to move on a frictionless cylinder of
radius $R$, given by the equation $\rho = R$ in cylindrical polar coordinates $(\rho, \phi, z)$,
as shown in Figure 7.7. Besides the force of constraint (the normal force of the
cylinder), the only force on the mass is a force $\mathbf{F} = -k\mathbf{r}$ directed toward the
origin. (This is the three-dimensional version of the Hooke’s-law force.) Using
$z$ and $\phi$ as generalized coordinates, find the Lagrangian $\mathcal{L}$. Write down and solve
Lagrange’s equations and describe the motion.

Since the particle’s coordinate $\rho$ is fixed at $\rho = R$, we can specify its position
by giving just $z$ and $\phi$, and since these two coordinates can vary independently
the system has two degrees of freedom and we can use $(z, \phi)$ as generalized
coordinates. The velocity has $v_\rho = 0$, $v_\phi = R\dot{\phi}$, and $v_z = \dot{z}$. Therefore the kinetic
energy is

$$T = \tfrac{1}{2}mv^2 = \tfrac{1}{2}m(R^2\dot{\phi}^2 + \dot{z}^2).$$

The potential energy for the force $\mathbf{F} = -k\mathbf{r}$ is (Problem 7.25) $U = \tfrac{1}{2}kr^2$, where $r$
is the distance from the origin to the particle, given by $r^2 = R^2 + z^2$ (see Figure
7.7). Therefore

$$U = \tfrac{1}{2}k(R^2 + z^2),$$

and the Lagrangian is

$$\mathcal{L} = \tfrac{1}{2}m(R^2\dot{\phi}^2 + \dot{z}^2) - \tfrac{1}{2}k(R^2 + z^2). \tag{7.57}$$

Since the system has two degrees of freedom, there are two equations of
motion. The $z$ equation is

$$\frac{\partial\mathcal{L}}{\partial z} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{z}}
\quad\text{or}\quad -kz = m\ddot{z}. \tag{7.58}$$



Figure 7.7 A mass $m$ is confined to the surface of the cylinder
$\rho = R$ and subject to a Hooke’s law force $\mathbf{F} = -k\mathbf{r}$.
The $\phi$ equation is even simpler. Since $\mathcal{L}$ does not depend on $\phi$, it follows that
$\partial\mathcal{L}/\partial\phi = 0$ and the $\phi$ equation is just

$$\frac{\partial\mathcal{L}}{\partial\phi} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{\phi}}
\quad\text{or}\quad 0 = \frac{d}{dt}\,mR^2\dot{\phi}. \tag{7.59}$$

The $z$ equation (7.58) tells us that the mass executes simple harmonic motion in
the $z$ direction, with $z = A\cos(\omega t - \delta)$, where $\omega = \sqrt{k/m}$. The $\phi$ equation (7.59)
tells us that the quantity $mR^2\dot{\phi}$ is constant, that is, that the angular momentum about
the $z$ axis is conserved, a result we could have anticipated since there is no torque in
this direction. Because $\rho$ is fixed, this implies simply that $\dot{\phi}$ is constant, and the
mass moves around the cylinder with constant angular velocity $\dot{\phi}$, at the same
time that it moves up and down in the $z$ direction in simple harmonic motion.
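The motion described in Example 7.4 can be reproduced by integrating the two equations
of motion (7.58) and (7.59) directly. The following is a minimal sketch under assumed
parameter values ($k = 4$, $m = 1$, so $\omega = \sqrt{k/m} = 2$). The function name and the choice of
the velocity Verlet scheme are illustrative assumptions, not anything prescribed by the
example.

```python
import math

def simulate_cylinder(z0, z_dot0, phi0, phi_dot0, k=4.0, m=1.0, t_end=2.0, steps=20000):
    """Integrate m z'' = -k z and phi'' = 0 with the velocity Verlet scheme."""
    h = t_end / steps
    z, z_dot, phi, phi_dot = z0, z_dot0, phi0, phi_dot0
    for _ in range(steps):
        acc = -k / m * z
        z += z_dot * h + 0.5 * acc * h * h
        new_acc = -k / m * z
        z_dot += 0.5 * (acc + new_acc) * h
        phi += phi_dot * h            # phi'' = 0, so phi_dot never changes
    return z, z_dot, phi, phi_dot

z, z_dot, phi, phi_dot = simulate_cylinder(z0=1.0, z_dot0=0.0, phi0=0.0, phi_dot0=3.0)
omega = math.sqrt(4.0 / 1.0)
print(z, math.cos(omega * 2.0))   # numerical z versus A cos(omega t - delta) with A = 1, delta = 0
print(phi, 3.0 * 2.0)             # phi grows linearly in time
```

The numerical $z$ tracks the cosine while $\phi$ advances at a constant rate, which is the
combination of simple harmonic motion and uniform rotation found above.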


These two examples illustrate the steps to be followed in solving any problem by 
the Lagrangian method (provided all constraints are holonomic and the nonconstraint 
forces are derivable from a potential energy, as we are assuming): 

1. Write down the kinetic and potential energies and hence the Lagrangian
$\mathcal{L} = T - U$, using any convenient inertial reference frame.

2. Choose a convenient set of $n$ generalized coordinates $q_1, \cdots, q_n$ and find
expressions for the original coordinates of step 1 in terms of your chosen
generalized coordinates. (Steps 1 and 2 can be done in either order.)

3. Rewrite $\mathcal{L}$ in terms of $q_1, \cdots, q_n$ and $\dot{q}_1, \cdots, \dot{q}_n$.

4. Write down the $n$ Lagrange equations (7.52).

As we shall see, these four steps provide an almost infallible route to the equations of 
motion of any system, no matter how complex. Whether the resulting equations can 
be easily solved is another matter, but even when they cannot, just having them is a 
huge step toward understanding a system and an essential step to finding approximate 
or numerical solutions. 
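For a system with a single generalized coordinate, the last step can even be carried
out numerically once $\mathcal{L}(q, \dot{q}, t)$ is available as a function. The sketch below is an
illustration under assumptions, not a general-purpose tool: it expands the time derivative
in the Lagrange equation by the chain rule and estimates every partial derivative by
central differences. The function names, step size, and parameter values are all
hypothetical. The second Lagrangian is built from the velocities implied by (7.39) for
the pendulum in the accelerating car.

```python
import math

def q_ddot(lagrangian, q, q_dot, t, h=1e-3):
    """Solve the Lagrange equation for the acceleration of one coordinate.

    Expanding d/dt (dL/dq_dot) by the chain rule gives
        L_vv * q_ddot = L_q - L_vq * q_dot - L_vt,
    where subscripts denote partial derivatives and v stands for q_dot.
    """
    L = lagrangian
    L_q = (L(q + h, q_dot, t) - L(q - h, q_dot, t)) / (2 * h)
    L_vv = (L(q, q_dot + h, t) - 2 * L(q, q_dot, t) + L(q, q_dot - h, t)) / h**2
    L_vq = (L(q + h, q_dot + h, t) - L(q + h, q_dot - h, t)
            - L(q - h, q_dot + h, t) + L(q - h, q_dot - h, t)) / (4 * h**2)
    L_vt = (L(q, q_dot + h, t + h) - L(q, q_dot - h, t + h)
            - L(q, q_dot + h, t - h) + L(q, q_dot - h, t - h)) / (4 * h**2)
    return (L_q - L_vq * q_dot - L_vt) / L_vv

def pendulum(phi, phi_dot, t, m=1.0, l=1.0, g=9.8):
    """Simple pendulum: T = m l^2 phi_dot^2 / 2 and U = -m g l cos(phi)."""
    return 0.5 * m * l**2 * phi_dot**2 + m * g * l * math.cos(phi)

def car_pendulum(phi, phi_dot, t, m=1.0, l=1.0, g=9.8, a=2.0):
    """Pendulum in the accelerating car, with velocities obtained from (7.39)."""
    x_dot = l * phi_dot * math.cos(phi) + a * t
    y_dot = -l * phi_dot * math.sin(phi)
    return 0.5 * m * (x_dot**2 + y_dot**2) + m * g * l * math.cos(phi)

# Compare with the accelerations obtained by differentiating by hand (l = 1).
print(q_ddot(pendulum, 0.5, 0.2, 0.0), -9.8 * math.sin(0.5))
print(q_ddot(car_pendulum, 0.3, 0.1, 0.7), -(9.8 * math.sin(0.3) + 2.0 * math.cos(0.3)))
```

The finite-difference estimates agree with the hand-derived accelerations to within the
accuracy of the differencing, which illustrates that the steps above are mechanical once
the Lagrangian has been written down.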

The next two examples illustrate how the Lagrangian approach can give the equations
of motion, almost effortlessly, for problems that would require considerable care
and ingenuity using Newtonian methods.


Example 7.5 A Block Sliding on a Wedge

Consider the block and wedge shown in Figure 7.8. The block (mass $m$) is free
to slide on the wedge, and the wedge (mass $M$) can slide on the horizontal table,
both with negligible friction. The block is released from the top of the wedge,
with both initially at rest. If the wedge has angle $\alpha$ and the length of its sloping
face is $l$, how long does the block take to reach the bottom?

The system has two degrees of freedom, and a good choice of the two
generalized coordinates is, as shown, the distance $q_1$ of the block from the top
of the wedge and the distance $q_2$ of the wedge from any convenient fixed point
on the table. The quantity we need to find is the acceleration $\ddot{q}_1$ of the block
relative to the wedge, since with this we can quickly find the time required to
slide the length of the wedge. Our first task is to write down the Lagrangian, and
it is often safest to do this in Cartesian coordinates, and then rewrite it in terms
of the chosen generalized coordinates.

The kinetic energy of the wedge is just $T_M = \tfrac{1}{2}M\dot{q}_2^2$, but that of the block
is more complicated. The block’s velocity relative to the wedge is $\dot{q}_1$ down the
slope, but the wedge itself has a horizontal velocity $\dot{q}_2$ relative to the table. The
velocity of the block relative to the inertial frame of the table is the vector sum of
these two. Resolving into rectangular components ($x$ to the right, $y$ downward),
we find for the velocity of the block relative to the table

$$\mathbf{v} = (v_x, v_y) = (\dot{q}_1\cos\alpha + \dot{q}_2,\; \dot{q}_1\sin\alpha).$$

Thus the kinetic energy of the block is

$$T_m = \tfrac{1}{2}m(v_x^2 + v_y^2) = \tfrac{1}{2}m(\dot{q}_1^2 + \dot{q}_2^2 + 2\dot{q}_1\dot{q}_2\cos\alpha).$$

(I used the identity $\cos^2\alpha + \sin^2\alpha = 1$ to simplify this.) The total kinetic energy
of the system is

$$T = T_M + T_m = \tfrac{1}{2}(M + m)\dot{q}_2^2 + \tfrac{1}{2}m(\dot{q}_1^2 + 2\dot{q}_1\dot{q}_2\cos\alpha). \tag{7.60}$$

The potential energy of the wedge is a constant, which we may as well take to
be zero. That of the block is $-mgy$, where $y = q_1\sin\alpha$ is the height of the block
measured down from the top of the wedge. Therefore

$$U = -mgq_1\sin\alpha,$$

and the Lagrangian is

$$\mathcal{L} = T - U = \tfrac{1}{2}(M + m)\dot{q}_2^2 + \tfrac{1}{2}m(\dot{q}_1^2 + 2\dot{q}_1\dot{q}_2\cos\alpha) + mgq_1\sin\alpha. \tag{7.61}$$

Once we have found the Lagrangian in terms of the generalized coordinates
$q_1$ and $q_2$, all we have to do is to write down the two Lagrange equations, one
for $q_1$ and one for $q_2$, and then solve them. The $q_2$ equation (which is a little
simpler) is

$$\frac{\partial\mathcal{L}}{\partial q_2} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_2}, \tag{7.62}$$
but, since $\mathcal{L}$ in (7.61) is clearly independent of $q_2$, this just tells us that the
generalized momentum $\partial\mathcal{L}/\partial\dot{q}_2$ is constant,

$$M\dot{q}_2 + m(\dot{q}_2 + \dot{q}_1\cos\alpha) = \text{const}, \tag{7.63}$$

a result you will recognize as conservation of the total momentum in the $x$
direction (and something you could have written down without any help from
Lagrange).

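The $q_1$ equation

$$\frac{\partial\mathcal{L}}{\partial q_1} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_1} \tag{7.64}$$

is more complicated, since neither derivative is zero. Substituting (7.61) for $\mathcal{L}$,
we can write this as

$$mg\sin\alpha = \frac{d}{dt}\,m(\dot{q}_1 + \dot{q}_2\cos\alpha) = m(\ddot{q}_1 + \ddot{q}_2\cos\alpha). \tag{7.65}$$

Differentiating (7.63) we see that

$$\ddot{q}_2 = -\frac{m}{M + m}\,\ddot{q}_1\cos\alpha, \tag{7.66}$$

which lets us eliminate $\ddot{q}_2$ from (7.65) and solve for $\ddot{q}_1$:

$$\ddot{q}_1 = \frac{g\sin\alpha}{1 - \dfrac{m\cos^2\alpha}{M + m}}. \tag{7.67}$$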


Armed with this value for $\ddot{q}_1$ we can quickly answer the original question: Since
the acceleration down the slope is constant, the distance traveled down the slope
in time $t$ is $\tfrac{1}{2}\ddot{q}_1 t^2$, and the time to travel the length $l$ is just $\sqrt{2l/\ddot{q}_1}$, with $\ddot{q}_1$ given
by (7.67). More interesting than this answer is to check that the formula (7.67) for
$\ddot{q}_1$ agrees with common sense in various special cases. For example, if $\alpha = 90^\circ$,
(7.67) implies that $\ddot{q}_1 = g$, which is clearly right; and, if $M \to \infty$, (7.67) implies
that $\ddot{q}_1 \to g\sin\alpha$, which is the well-known acceleration for a block on a fixed
incline and clearly makes sense. I leave it as an exercise (Problem 7.19) to
check that in the limit that $M \to 0$, our answers agree with what you could
have predicted.
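The elimination that leads to (7.67), and the two special cases just mentioned, can be
checked with a few lines of code. This is a sketch with assumed names and sample
values: it solves the pair of linear equations formed by (7.65) and the time derivative of
(7.63) for the two accelerations and compares the result with the closed form (7.67).

```python
import math

def wedge_accelerations(m, M, alpha, g=9.8):
    """Solve (7.65) and the derivative of (7.63) for (q1_ddot, q2_ddot) by Cramer's rule.

        m*q1_ddot + m*cos(alpha)*q2_ddot = m*g*sin(alpha)
        m*cos(alpha)*q1_ddot + (M + m)*q2_ddot = 0
    """
    c = math.cos(alpha)
    det = m * (M + m) - (m * c) ** 2
    q1_ddot = m * g * math.sin(alpha) * (M + m) / det
    q2_ddot = -m * c * m * g * math.sin(alpha) / det
    return q1_ddot, q2_ddot

def q1_ddot_formula(m, M, alpha, g=9.8):
    """Closed form (7.67) for the block's acceleration relative to the wedge."""
    return g * math.sin(alpha) / (1 - m * math.cos(alpha) ** 2 / (M + m))

def slide_time(m, M, alpha, l, g=9.8):
    """Time for the block to slide the length l of the sloping face from rest."""
    return math.sqrt(2 * l / q1_ddot_formula(m, M, alpha, g))

alpha = math.radians(30)
print(wedge_accelerations(1.0, 3.0, alpha)[0], q1_ddot_formula(1.0, 3.0, alpha))
print(q1_ddot_formula(1.0, 3.0, math.pi / 2))               # vertical face: close to g
print(q1_ddot_formula(1.0, 1e12, alpha), 9.8 * math.sin(alpha))   # very heavy wedge
print(slide_time(1.0, 3.0, alpha, l=2.0))
```

The wedge's acceleration `q2_ddot` comes out negative, that is, opposite to the
horizontal direction in which the block moves, as conservation of horizontal momentum
requires.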

Example 7.6 A Bead on a Spinning Wire Hoop

A bead of mass $m$ is threaded on a frictionless circular wire hoop of radius $R$.
The hoop lies in a vertical plane, which is forced to rotate about the hoop’s
vertical diameter with constant angular velocity $\dot{\phi} = \omega$, as shown in Figure
7.9. The bead’s position on the hoop is specified by the angle $\theta$ measured up
from the vertical. Write down the Lagrangian for the system in terms of the
generalized coordinate $\theta$ and find the equation of motion for the bead. Find any
equilibrium positions at which the bead can remain with $\theta$ constant, and explain
their locations in terms of statics and the “centrifugal force” $m\omega^2\rho$ (where $\rho$
is the bead’s distance from the axis). Use the equation of motion to discuss the
stability of the equilibrium positions.

Figure 7.9 A bead is free to move around the frictionless wire
hoop, which is spinning at a fixed rate $\omega$ about its vertical axis.
The bead’s position is specified by the angle $\theta$; its distance from
the axis of rotation is $\rho = R\sin\theta$.

Our first task is to write down the Lagrangian. Relative to a nonrotating
frame, the bead has velocity $R\dot{\theta}$ tangential to the hoop and $\rho\omega = (R\sin\theta)\omega$
normal to the hoop (the latter due to the spinning of the hoop with angular
velocity $\omega$). Thus the kinetic energy is $T = \tfrac{1}{2}mv^2 = \tfrac{1}{2}mR^2(\dot{\theta}^2 + \omega^2\sin^2\theta)$.
The gravitational potential energy is easily seen to be $U = mgR(1 - \cos\theta)$,
measured from the bottom of the hoop. Therefore, the Lagrangian is

$$\mathcal{L} = \tfrac{1}{2}mR^2(\dot{\theta}^2 + \omega^2\sin^2\theta) - mgR(1 - \cos\theta), \tag{7.68}$$

and the Lagrange equation is

$$\frac{\partial\mathcal{L}}{\partial\theta} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{\theta}}
\quad\text{or}\quad mR^2\omega^2\sin\theta\cos\theta - mgR\sin\theta = mR^2\ddot{\theta}.$$

Dividing through by $mR^2$, we arrive at the desired equation of motion:

$$\ddot{\theta} = (\omega^2\cos\theta - g/R)\sin\theta. \tag{7.69}$$

Although this equation cannot be solved analytically in terms of elementary
functions, it can, nevertheless, tell us lots about the system’s behavior. To
illustrate this, let us use (7.69) to find the equilibrium positions of the bead.

An equilibrium point is any value of $\theta$, call it $\theta_0$, satisfying the following
condition: If the bead is placed at rest ($\dot\theta = 0$) at $\theta = \theta_0$, then it will remain at
rest at $\theta_0$. This condition is guaranteed if $\ddot\theta = 0$. (To see this, note that if $\ddot\theta = 0$,
then $\dot\theta$ doesn’t change and remains zero, which means that $\theta$ doesn’t change
and remains equal to $\theta_0$.) Thus to find the equilibrium positions we have only to
equate the right side of (7.69) to zero:

$$(\omega^2\cos\theta - g/R)\sin\theta = 0. \tag{7.70}$$

This equation is satisfied if either of the two factors is zero. The factor $\sin\theta$ is
zero if $\theta = 0$ or $\pi$. Thus the bead can remain at rest at the bottom or top of the
hoop. The first factor in (7.70) vanishes when

$$\cos\theta = \frac{g}{\omega^2 R}.$$

Since $|\cos\theta|$ must be less than or equal to 1, the first factor can vanish only
when $\omega^2 \ge g/R$. When this condition is satisfied, there are two more equilibrium
positions at

$$\theta_0 = \pm\arccos\left(\frac{g}{\omega^2 R}\right). \tag{7.71}$$

We conclude that when the hoop is rotating slowly ($\omega^2 < g/R$), there are just
two equilibrium positions, at the bottom and top of the hoop, but when it rotates
fast enough ($\omega^2 > g/R$), there are two more, symmetrically placed on either
side of the bottom, as given by (7.71). 10

Perhaps the simplest way to understand the various equilibrium positions is
in terms of the “centrifugal force.” In most introductory physics courses, the
centrifugal force is dismissed as an abomination to be avoided by all
right-thinking physicists. As long as we confine our attention to inertial frames, this
is a correct (and certainly a safe) point of view. Nevertheless, as we shall see
in Chapter 9, from the point of view of a noninertial rotating frame there is a
perfectly real centrifugal force $m\omega^2\rho$ (perhaps more familiar as $mv^2/\rho$), where
$\rho$ is the object’s distance from the axis of rotation. Thus, taking the point of
view of a fly perched on the rotating hoop, we can understand the equilibrium
positions as follows: At the bottom or top of the hoop, the bead is on the axis of
rotation and $\rho = 0$; therefore, the centrifugal force $m\omega^2\rho$ is zero. Furthermore,
the force of gravity is normal to the hoop, so there is no force tending to move
the bead along the wire and the bead remains at rest. The other two equilibrium
points are a little subtler: At any position off the axis (such as that shown in
Figure 7.9) the centrifugal force is nonzero and has a component pushing the
bead outward along the wire; meanwhile the force of gravity has a component
pulling the bead inward along the wire (provided the bead is below the halfway
marks, $\theta = \pm\pi/2$). At either of the points given by (7.71), these two components
are balanced (check this for yourself in Problem 7.28) and the bead can remain
at rest.

10 Notice that when $\omega^2 = g/R$ the two extra positions given by (7.71) have just come into
existence and coincide with the first point at the bottom with $\theta = \pm 0$.

An equilibrium point $\theta_0$ is not especially interesting unless it is stable; that
is, the bead, if nudged a little away from $\theta_0$, moves back toward $\theta_0$. Using our
equation of motion (7.69), we can easily address this issue, and I’ll start with
the equilibrium at the bottom, $\theta = 0$. As long as $\theta$ remains close to 0, we can
set $\cos\theta \approx 1$ and $\sin\theta \approx \theta$ and approximate the equation by

$$\ddot\theta = (\omega^2 - g/R)\,\theta \qquad [\theta \text{ near } 0]. \tag{7.72}$$

If the hoop is rotating slowly ($\omega^2 < g/R$), this has the form

$$\ddot\theta = (\text{negative number})\,\theta.$$

If we nudge the bead to the right ($\theta > 0$), then since $\theta$ is positive $\ddot\theta$ is negative,
and the bead accelerates to the left, that is, back toward the bottom. If we nudge
it to the left ($\theta < 0$), then $\ddot\theta$ becomes positive, and the bead accelerates to the
right, which is again back toward the bottom. Either way, the bead returns toward
the equilibrium, which is, therefore, stable.

If we speed up the rotation of the hoop, so that $\omega^2 > g/R$, then the approximate
equation of motion (7.72) takes the form

$$\ddot\theta = (\text{positive number})\,\theta.$$

Now a small displacement to the right makes $\ddot\theta$ positive, and the bead accelerates
away from the bottom. Similarly a displacement to the left makes $\ddot\theta$ negative,
and again the bead accelerates away from the bottom. Thus, as we increase $\omega$
past the critical value where $\omega^2 = g/R$, the equilibrium at the bottom changes
from stable to unstable.

The equilibrium at the top ($\theta = \pi$) is always unstable (see Problem 7.28).
This is easy to understand from our discussion of the centrifugal force. Near the 
top of the hoop, both the centrifugal and gravitational forces tend to push the 
bead away from the top, so there is no chance of a restoring force to pull it back 
to the equilibrium position. 

The other two equilibrium positions exist only when $\omega^2 > g/R$, and are
easily seen to be stable: The equation of motion (7.69) is

$$\ddot\theta = (\omega^2\cos\theta - g/R)\sin\theta. \tag{7.73}$$

To be definite, let us consider the equilibrium on the right with $0 < \theta < \pi/2$. At
the equilibrium point, the term in parentheses on the right of (7.73) is zero, while
$\sin\theta$ is positive. If we increase $\theta$ a little (bead moves up and to the right), $\sin\theta$
remains positive, but the term in parentheses becomes negative. (Remember,
$\cos\theta$ is a decreasing function in this quadrant.) Thus $\ddot\theta$ becomes negative, and
the bead accelerates back toward its equilibrium point. If we decrease $\theta$ a little
from the equilibrium, then $\ddot\theta$ becomes positive, and again the bead accelerates
back toward equilibrium. Therefore the equilibrium on the right is stable. As you
would expect, a similar analysis shows that the same is true of the equilibrium
on the left.

We arrive at the following interesting story: When the hoop is rotating slowly
($\omega^2 < g/R$), there is just one stable equilibrium, at $\theta = 0$. If we speed up
the rotation, then as $\omega$ passes the critical value where $\omega^2 = g/R$, this original
equilibrium becomes unstable, but two new stable equilibrium points appear,
emerging from $\theta = 0$ and moving out to the right and left as we increase $\omega$
still more. This phenomenon (the disappearance of one stable equilibrium and
the simultaneous appearance of two others diverging from the same point) is
called a bifurcation and will be one of our principal topics in Chapter 12 on
chaos theory.
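
The stability story above can be checked numerically. The following is only a minimal sketch: the values of $g$ and $R$ and the function names `slope` and `classify` are illustrative assumptions, not part of the example. It linearizes the right side of (7.69) about each equilibrium; the derivative of $(\omega^2\cos\theta - g/R)\sin\theta$ with respect to $\theta$ is $\omega^2\cos 2\theta - (g/R)\cos\theta$, and a negative value signals a restoring acceleration.

```python
import math

def slope(theta, omega, g=9.81, R=1.0):
    """d(theta_ddot)/d(theta) for equation (7.69); negative means restoring."""
    return omega**2 * math.cos(2 * theta) - (g / R) * math.cos(theta)

def classify(omega, g=9.81, R=1.0):
    """Label each equilibrium of (7.70) as stable or unstable (g, R assumed)."""
    points = {"bottom": 0.0, "top": math.pi}
    c = g / (omega**2 * R)           # cos(theta_0) from (7.71)
    if c < 1.0:                      # off-axis equilibria exist only if omega^2 > g/R
        points["right"] = math.acos(c)
        points["left"] = -math.acos(c)
    return {name: ("stable" if slope(th, omega, g, R) < 0 else "unstable")
            for name, th in points.items()}

for omega in (2.0, 5.0):             # one slow and one fast rotation rate
    print(omega, classify(omega))
```

With these assumed numbers the slow hoop has a stable bottom and an unstable top, while the fast hoop has an unstable bottom and two stable off-axis points, in line with the bifurcation described above.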

It is interesting to note that the device of this example was used by James 
Watt (1736-1829) as a governor for his steam engines. The device rotated with 
the engine, and as the engine sped up the bead rose on the hoop. When the 
angular velocity $\omega$ reached some predetermined maximum allowable value, the
bead, arriving at a corresponding height, caused the supply of steam to be shut 
off. 


This example illustrates another strength of the Lagrangian method that was
mentioned back in Section 7.1: The generalized coordinates can even be coordinates
relative to a noninertial reference frame, just as long as the frame in which the
Lagrangian $\mathcal{L} = T - U$ was originally written was inertial. In this example, the angle
$\theta$ was the polar angle of the bead, measured in the noninertial rotating frame of the
hoop, but the Lagrangian (7.68) was defined as $\mathcal{L} = T - U$ with $T$ and $U$ evaluated
in the inertial frame relative to which the hoop rotates. 11

11 Example 7.5 was another instance: The coordinate $q_1$ gave the position of the block relative
to the accelerating frame of the wedge, but the kinetic energy $T$ was evaluated in the inertial frame
of the table. For another example, see Problem 7.30.

In the next and final example of this section, we pursue the previous example of
the bead on the rotating hoop, and obtain approximate solutions of the equation of
motion in the neighborhood of the stable equilibrium points.


Example 7.7 Oscillations of the Bead near Equilibrium

Consider again the bead of the previous example and use the equation of motion
to find the bead’s approximate behavior in the neighborhood of the stable
equilibrium positions.

When $\omega^2 < g/R$, the only stable equilibrium is at the bottom of the hoop
with $\theta = 0$. As long as $\theta$ remains small, we can approximate the equation of
motion (7.73) by setting $\sin\theta \approx \theta$ and $\cos\theta \approx 1$ to give

$$\ddot\theta = -(g/R - \omega^2)\,\theta \qquad [\theta \text{ near } 0]$$

$$\phantom{\ddot\theta} = -\Omega^2\theta \tag{7.74}$$

where the second line introduces the frequency

$$\Omega = \sqrt{g/R - \omega^2}.$$

As long as $\omega^2 < g/R$, this defines $\Omega$ as a real positive number, and we recognize
(7.74) as the equation for simple harmonic motion with frequency $\Omega$. We
conclude that a bead which is displaced a little from the stable equilibrium at
$\theta = 0$ executes harmonic motion with frequency $\Omega$,

$$\theta(t) = A\cos(\Omega t - \delta). \tag{7.75}$$

If we speed up the rate of the hoop’s rotation until $\omega^2 > g/R$, then $\Omega$ becomes
pure imaginary, and, since $\cos i\alpha = \cosh\alpha$, our solution (7.75) becomes a
hyperbolic cosine, which grows with time, correctly reflecting that the equilibrium
at $\theta = 0$ has become unstable.

Once $\omega^2 > g/R$, there are two stable equilibrium positions given by (7.71)
and located symmetrically to the right and left of the bottom of the hoop. As
you might expect, these behave in the same way, and to be definite I shall focus
on the one on the right. Let us denote its position by $\theta = \theta_0$, where, according
to (7.70), $\theta_0$ satisfies

$$\omega^2\cos\theta_0 - g/R = 0. \tag{7.76}$$

Let us now imagine the bead placed close to $\theta_0$ at

$$\theta = \theta_0 + \epsilon$$

and investigate the time dependence of the small parameter $\epsilon$. Once again we
can approximate the equation of motion (7.73), though this requires more care.
If we approximate the factors of $\cos(\theta_0 + \epsilon)$ and $\sin(\theta_0 + \epsilon)$ by the first two
terms of their Taylor series,

$$\cos(\theta_0 + \epsilon) \approx \cos\theta_0 - \epsilon\sin\theta_0 \qquad\text{and}\qquad \sin(\theta_0 + \epsilon) \approx \sin\theta_0 + \epsilon\cos\theta_0 \tag{7.77}$$

then the equation of motion (7.73) becomes

$$\ddot\theta = [\omega^2\cos(\theta_0 + \epsilon) - g/R]\,\sin(\theta_0 + \epsilon) \qquad [\theta \text{ near } \theta_0]$$

$$\phantom{\ddot\theta} = [\omega^2\cos\theta_0 - \epsilon\,\omega^2\sin\theta_0 - g/R]\,[\sin\theta_0 + \epsilon\cos\theta_0]. \tag{7.78}$$

By (7.76) the first and third terms in the first square bracket cancel, leaving just
the middle term $-\epsilon\,\omega^2\sin\theta_0$. To lowest order in $\epsilon$ we can drop the second term
of the second bracket, and, since $\ddot\theta$ is the same as $\ddot\epsilon$, we are left with

$$\ddot\epsilon = -\epsilon\,\omega^2\sin^2\theta_0 = -\Omega'^2\epsilon. \tag{7.79}$$

Here the second equality defines the frequency $\Omega' = \omega\sin\theta_0$, or, using (7.76),

$$\Omega' = \omega\sqrt{1 - \cos^2\theta_0} = \sqrt{\omega^2 - \left(\frac{g}{\omega R}\right)^2} \tag{7.80}$$

(see Problem 7.26). Equation (7.79) is the equation for simple harmonic motion.
Therefore, the parameter $\epsilon$ oscillates about zero, and the bead itself oscillates
about the equilibrium position $\theta_0$ with frequency $\Omega'$.
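
One way to test the small-oscillation result is to integrate the full equation of motion numerically and compare the measured period with $2\pi/\Omega'$. The sketch below makes several assumptions that are not part of the example: the values of $\omega$, $g$, and $R$, the initial offset `eps0`, the step size, and the use of a classical fourth-order Runge-Kutta integrator. The names `accel`, `rk4_step`, and `measured_period` are hypothetical helpers.

```python
import math

def accel(theta, omega, g, R):
    """Right side of the equation of motion (7.73)."""
    return (omega**2 * math.cos(theta) - g / R) * math.sin(theta)

def rk4_step(theta, thetadot, dt, omega, g, R):
    """One classical Runge-Kutta step for the pair (theta, thetadot)."""
    k1x, k1v = thetadot, accel(theta, omega, g, R)
    k2x, k2v = thetadot + 0.5 * dt * k1v, accel(theta + 0.5 * dt * k1x, omega, g, R)
    k3x, k3v = thetadot + 0.5 * dt * k2v, accel(theta + 0.5 * dt * k2x, omega, g, R)
    k4x, k4v = thetadot + dt * k3v, accel(theta + dt * k3x, omega, g, R)
    theta += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    thetadot += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return theta, thetadot

def measured_period(omega, g=9.81, R=1.0, eps0=0.01, dt=1e-4):
    """Time between two successive upward zero crossings of eps = theta - theta0."""
    theta0 = math.acos(g / (omega**2 * R))       # equilibrium from (7.71)
    theta, thetadot, t = theta0 + eps0, 0.0, 0.0
    crossings = []
    while len(crossings) < 2:
        new_theta, new_thetadot = rk4_step(theta, thetadot, dt, omega, g, R)
        if theta - theta0 < 0.0 <= new_theta - theta0:
            # linear interpolation for the crossing time inside this step
            frac = (theta0 - theta) / (new_theta - theta)
            crossings.append(t + frac * dt)
        theta, thetadot, t = new_theta, new_thetadot, t + dt
    return crossings[1] - crossings[0]

omega, g, R = 5.0, 9.81, 1.0                     # assumed illustrative values
predicted = 2 * math.pi / math.sqrt(omega**2 - (g / (omega * R))**2)   # from (7.80)
numerical = measured_period(omega, g, R)
print(predicted, numerical)
```

For a small initial offset the two periods should agree closely; increasing `eps0` would be expected to expose the nonlinear corrections that the linearization (7.79) drops.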
7.6 Generalized Momenta and Ignorable Coordinates


As I have already mentioned, for any system with $n$ generalized coordinates $q_i$
($i = 1, \ldots, n$), we refer to the $n$ quantities $\partial\mathcal{L}/\partial q_i = F_i$ as generalized forces and
$\partial\mathcal{L}/\partial\dot q_i = p_i$ as generalized momenta. With this terminology, the Lagrange equation,

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q_i}, \tag{7.81}$$

can be rewritten as

$$F_i = \frac{d}{dt}\,p_i. \tag{7.82}$$

That is, “generalized force = rate of change of generalized momentum.” In particular,
if the Lagrangian is independent of a particular coordinate $q_i$, then $F_i = \partial\mathcal{L}/\partial q_i = 0$
and the corresponding generalized momentum $p_i$ is constant.

Consider, for example, a single projectile subject only to the force of gravity.
The potential energy is $U = mgz$ (if we use Cartesian coordinates with $z$ measured
vertically up), and the Lagrangian is

$$\mathcal{L} = \mathcal{L}(x, y, z, \dot x, \dot y, \dot z) = \tfrac{1}{2}m(\dot x^2 + \dot y^2 + \dot z^2) - mgz. \tag{7.83}$$

With respect to Cartesian coordinates, the generalized force is just the usual force
($\partial\mathcal{L}/\partial x = -\partial U/\partial x = F_x$, etc.) and the generalized momentum is just the usual
momentum ($\partial\mathcal{L}/\partial\dot x = m\dot x = p_x$, etc.). Because $\mathcal{L}$ is independent of $x$ and $y$, it
immediately follows that the components $p_x$ and $p_y$ are constant, as we already knew.

In general, the generalized forces and momenta are not the same as the usual
forces and momenta. For instance, we saw in Equations (7.25) and (7.26) that in
two-dimensional polar coordinates the $\phi$ component of the generalized force is the torque,
and that of the generalized momentum is actually the angular momentum. In any case,
when the Lagrangian is independent of a coordinate $q_i$ the corresponding generalized
momentum is conserved. Thus, if the Lagrangian of a two-dimensional particle is
independent of $\phi$, then the particle’s angular momentum is conserved, another
important result (and one that is clear from the Newtonian perspective as well). When
the Lagrangian is independent of a coordinate $q_i$, that coordinate is sometimes said
to be ignorable or cyclic. Obviously it is a good idea, when possible, to choose
coordinates so that as many as possible are ignorable and their corresponding momenta
are constant. In fact, this is perhaps the main criterion in choosing generalized
coordinates for any given problem: Try to find coordinates as many as possible of which
are ignorable.
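
As a brief worked illustration of an ignorable coordinate, consider a particle moving in a plane under a central potential $U(r)$. The choice of a central potential is an assumption made here for illustration; the general statement above does not depend on it.

```latex
% Sketch: polar coordinates (r, \phi) with an assumed central potential U(r)
\mathcal{L} = \tfrac{1}{2} m \left( \dot r^{2} + r^{2} \dot\phi^{2} \right) - U(r),
\qquad
\frac{\partial \mathcal{L}}{\partial \phi} = 0
\quad\Longrightarrow\quad
p_{\phi} = \frac{\partial \mathcal{L}}{\partial \dot\phi} = m r^{2} \dot\phi = \text{const}.
```

Here $\phi$ is ignorable, and the conserved generalized momentum $p_\phi = mr^2\dot\phi$ is the angular momentum, whereas the same Lagrangian written in Cartesian coordinates has no ignorable coordinate at all.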

We can rephrase the result of the last three paragraphs by noting that the statement
“$\mathcal{L}$ is independent of a coordinate $q_i$” is equivalent to saying “$\mathcal{L}$ is unchanged, or
invariant, when $q_i$ varies (with all the other $q_j$ held fixed).” Thus we can say that if
$\mathcal{L}$ is invariant under variations of a coordinate $q_i$ then the corresponding generalized
momentum $p_i$ is conserved. This connection between invariance of $\mathcal{L}$ and certain
conservation laws is the first of several similar results relating invariance under
transformations (translations, rotations, and so on) to conservation laws. These results
are known collectively as Noether’s theorem, after the German mathematician Emmy
Noether (1882-1935). I shall return to this important theorem in Section 7.8.


7.7 Conclusion 


The Lagrangian version of classical mechanics has the two great advantages that, 
unlike the Newtonian version, it works equally well in all coordinate systems and 
it can handle constrained systems easily, avoiding any need to discuss the forces of 
constraint. If the system is constrained, one must choose a suitable set of independent 
generalized coordinates. Whether or not there are constraints, the next task is to write 
down the Lagrangian $\mathcal{L}$ in terms of the chosen coordinates. The equations of motion
then follow automatically in the standard form

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q_i} \qquad [i = 1, \ldots, n].$$

There is, of course, no guarantee that the resulting equations will be easy to solve, 
and in most real problems they are not, requiring numerical solution or at least some 
approximations before they can be solved analytically. 

Even in problems that are only moderately complicated, like the examples of 
Section 7.5, finding the equations of motion by Lagrange’s method is remarkably 
easier than by using Newton’s second law. Indeed, some purists object that the 
Lagrangian approach makes life too easy, removing the need to think about the 
physics. 

The Lagrangian formalism can be extended to include more general systems than 
those considered so far. One important case is that of magnetic forces, which I take up 
in Section 7.9. Dissipative forces, such as friction or air resistance, can sometimes be 
included, but it should be admitted that the Lagrangian formalism is primarily suited 
to problems where dissipative forces are absent or, at least, negligible. 

The final three sections of this chapter treat three advanced topics, all of which 
are centrally important in Lagrangian mechanics, but all of which could be omitted 
on a first reading. In Section 7.8, I give two more examples of the remarkable 
connection between invariance under certain transformations and conservation laws. 
This connection, known as Noether’s theorem, is important in all of modern physics,
but especially in quantum physics. Section 7.9 discusses how to include magnetic
forces in Lagrangian mechanics, another topic of great importance in quantum theory.
Finally, Section 7.10 introduces the method of Lagrange multipliers. This technique
appears in many different guises in many areas of physics, but I shall restrict myself to
some simple examples in Lagrangian mechanics. These last three sections are arranged
to be self-contained and independent. You could study all of them, none of them, or
any selection in between.


7.8 More about Conservation Laws*


* The material of this section is more advanced than the preceding sections, and you should 
feel free to omit it on a first reading. Be aware, however, that the material discussed here is 
needed before you read Section 11.5 and Chapter 13. 

In this section I shall discuss how the laws of conservation of momentum and energy 
fit into the Lagrangian formulation of mechanics. Since we derived the Lagrangian 
formulation from the Newtonian, anything that we already knew about conservation 
laws, based on Newtonian mechanics, will naturally still be true in Lagrangian
mechanics. Nevertheless, we can gain some new insights by examining the conservation
laws from a Lagrangian perspective. Furthermore, much modern work takes the
Lagrangian formulation (based on Hamilton’s principle, for example) as its starting point.
In this context, it is important to know what can be said about conservation laws strictly 
within the Lagrangian framework. 


Conservation of Total Momentum 

We already know from Newtonian mechanics that the total momentum of an isolated
system of $N$ particles is conserved, but let us examine this important property from the
Lagrangian point of view. One of the most prominent features of an isolated system
is that it is translationally invariant; that is, if we transport all $N$ particles bodily
through the same displacement $\boldsymbol{\epsilon}$, nothing physically significant about the system
should change. This is illustrated in Figure 7.10, where we see that the effect of moving
the whole system through the fixed displacement $\boldsymbol{\epsilon}$ is to replace every position $\mathbf{r}_\alpha$ by
$\mathbf{r}_\alpha + \boldsymbol{\epsilon}$,

$$\mathbf{r}_1 \to \mathbf{r}_1 + \boldsymbol{\epsilon}, \quad \mathbf{r}_2 \to \mathbf{r}_2 + \boldsymbol{\epsilon}, \quad \ldots, \quad \mathbf{r}_N \to \mathbf{r}_N + \boldsymbol{\epsilon}. \tag{7.84}$$

In particular, the potential energy must be unaffected by this displacement, so that

$$U(\mathbf{r}_1 + \boldsymbol{\epsilon}, \ldots, \mathbf{r}_N + \boldsymbol{\epsilon}, t) = U(\mathbf{r}_1, \ldots, \mathbf{r}_N, t) \tag{7.85}$$

or, more briefly,

$$\delta U = 0$$

where $\delta U$ denotes the change in $U$ under the translation (7.84). Clearly the velocities
are unchanged by the translation (7.84). (Adding a constant $\boldsymbol{\epsilon}$ to all the $\mathbf{r}_\alpha$ doesn’t
change the $\dot{\mathbf{r}}_\alpha$.) Therefore $\delta T = 0$, and hence

$$\delta\mathcal{L} = 0 \tag{7.86}$$

Figure 7.10 An isolated system of $N$ particles is translationally
invariant, which means that when every particle is transported
through the same displacement $\boldsymbol{\epsilon}$, nothing physically significant
changes.


under the translation (7.84). This result is true for any displacement $\boldsymbol{\epsilon}$. If we choose $\boldsymbol{\epsilon}$
to be an infinitesimal displacement in the $x$ direction, then all of the $x$ coordinates
$x_1, \ldots, x_N$ increase by $\epsilon$, while the $y$ and $z$ coordinates are unchanged. For this
translation, the change in $\mathcal{L}$ is

$$\delta\mathcal{L} = \epsilon\frac{\partial\mathcal{L}}{\partial x_1} + \cdots + \epsilon\frac{\partial\mathcal{L}}{\partial x_N} = 0.$$

This implies that

$$\sum_{\alpha=1}^{N}\frac{\partial\mathcal{L}}{\partial x_\alpha} = 0. \tag{7.87}$$

Now using Lagrange’s equations we can rewrite each derivative as

$$\frac{\partial\mathcal{L}}{\partial x_\alpha} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot x_\alpha} = \frac{d}{dt}\,p_{\alpha x}$$

where $p_{\alpha x}$ is the $x$ component of the momentum of particle $\alpha$. Thus (7.87) becomes

$$\sum_{\alpha}\frac{d}{dt}\,p_{\alpha x} = \frac{d}{dt}\,P_x = 0$$

where $P_x$ is the $x$ component of the total momentum $\mathbf{P} = \sum_\alpha \mathbf{p}_\alpha$. By choosing the
small displacement $\boldsymbol{\epsilon}$ successively in the $y$ and $z$ directions, we can prove the same
result for the $y$ and $z$ components, and we reach the conclusion that, provided the
Lagrangian is unchanged by the translation (7.84), the total momentum of the
$N$-particle system is conserved. This connection between translational invariance of $\mathcal{L}$
and conservation of total momentum is another example of Noether’s theorem.
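
The link between translational invariance and momentum conservation can be illustrated with a small simulation. This is a hedged sketch, not something taken from the text: it assumes two particles on a line joined by a spring, so that $U = \tfrac{1}{2}k(x_2 - x_1 - \ell)^2$ depends only on the separation and is unchanged when both positions are shifted by the same amount. The masses, spring constant, initial conditions, and the velocity-Verlet integrator are all illustrative choices.

```python
def simulate(steps=20000, dt=1e-3, m1=1.0, m2=3.0, k=4.0, rest=1.0):
    """Two particles joined by a spring; U depends only on x2 - x1."""
    x1, x2, v1, v2 = 0.0, 1.5, 0.7, -0.2        # assumed initial conditions

    def forces(x1, x2):
        stretch = (x2 - x1) - rest
        f1 = k * stretch                        # -dU/dx1
        return f1, -f1                          # -dU/dx2 is equal and opposite

    p_initial = m1 * v1 + m2 * v2
    f1, f2 = forces(x1, x2)
    for _ in range(steps):                      # velocity-Verlet integration
        v1 += 0.5 * dt * f1 / m1
        v2 += 0.5 * dt * f2 / m2
        x1 += dt * v1
        x2 += dt * v2
        f1, f2 = forces(x1, x2)
        v1 += 0.5 * dt * f1 / m1
        v2 += 0.5 * dt * f2 / m2
    return p_initial, m1 * v1 + m2 * v2

p_initial, p_final = simulate()
print(p_initial, p_final)
```

Because the two forces are equal and opposite at every step, the total momentum should stay fixed to rounding error even though each particle’s own momentum changes. Replacing the spring with a potential that depends on $x_1$ alone would break the invariance, and with it the conservation law.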


Conservation of Energy 

Finally, I would like to discuss conservation of energy from the Lagrangian point of 
view. The analysis proves somewhat complicated, but introduces a number of ideas 
that are important in more advanced work, particularly in the Hamiltonian formulation 
of mechanics (Chapter 13). 

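As time advances the function $\mathcal{L}(q_1, \ldots, q_n, \dot q_1, \ldots, \dot q_n, t)$ changes, both because
$t$ is changing, and because the $q$’s and $\dot q$’s change with the evolving system. Thus, by
the chain rule,

$$\frac{d}{dt}\mathcal{L}(q_1, \ldots, q_n, \dot q_1, \ldots, \dot q_n, t) = \sum_i \frac{\partial\mathcal{L}}{\partial q_i}\,\dot q_i + \sum_i \frac{\partial\mathcal{L}}{\partial\dot q_i}\,\ddot q_i + \frac{\partial\mathcal{L}}{\partial t}. \tag{7.88}$$

Now, by Lagrange’s equation, I can replace the derivative in the first sum on the right
by

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q_i} = \frac{d}{dt}\,p_i = \dot p_i.$$

Meanwhile, the derivative in the second sum on the right of (7.88) is just the
generalized momentum $p_i$. Thus, I can rewrite (7.88) as

$$\frac{d}{dt}\mathcal{L} = \sum_i (\dot p_i \dot q_i + p_i \ddot q_i) + \frac{\partial\mathcal{L}}{\partial t} = \frac{d}{dt}\sum_i (p_i \dot q_i) + \frac{\partial\mathcal{L}}{\partial t}. \tag{7.89}$$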

Now, for many interesting systems, the Lagrangian does not depend explicitly on
time; that is, $\partial\mathcal{L}/\partial t = 0$. When this is the case, the second term on the right of (7.89)
vanishes. If we move the left side of (7.89) over to the right, we see that the time
derivative of the quantity $\sum p_i \dot q_i - \mathcal{L}$ is zero. This quantity is so important that it has
its own symbol,

$$\mathcal{H} = \sum_{i=1}^{n} p_i \dot q_i - \mathcal{L}, \tag{7.90}$$

and is called the Hamiltonian of the system. With this terminology, we can state the
following important conclusion:


If the Lagrangian $\mathcal{L}$ does not depend explicitly on time (that is, $\partial\mathcal{L}/\partial t = 0$),
then the Hamiltonian $\mathcal{H}$ is conserved.


The discovery of any conservation law is a momentous event and is enough to justify
saying that the Hamiltonian is an important quantity. In fact, it goes much further than
this. As we shall see in Chapter 13, the Hamiltonian $\mathcal{H}$ is the basis of the Hamiltonian
formulation of mechanics, in just the same way that $\mathcal{L}$ is the basis of Lagrangian
mechanics.

For the moment, the chief importance of our newly discovered Hamiltonian is that
in many situations it is in fact just the total energy of the system. Specifically, we shall
prove that, provided the relation between the generalized coordinates and Cartesians
is time-independent,

$$\mathbf{r}_\alpha = \mathbf{r}_\alpha(q_1, \ldots, q_n), \tag{7.91}$$

the Hamiltonian $\mathcal{H}$ is just the total energy,

$$\mathcal{H} = T + U. \tag{7.92}$$

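You may recall that we agreed in Section 7.3 to describe generalized coordinates
that satisfy (7.91) as natural; thus we can paraphrase the result (7.92) to say that,
provided the generalized coordinates are natural, $\mathcal{H}$ is just the total energy $T + U$.
To prove this, let us express the total kinetic energy $T = \tfrac{1}{2}\sum_\alpha m_\alpha \dot{\mathbf{r}}_\alpha^2$ in terms of the
generalized coordinates $q_1, \ldots, q_n$. First, differentiating (7.91) with respect to $t$ and
using the chain rule, we find that 12

$$\dot{\mathbf{r}}_\alpha = \sum_{i=1}^{n}\frac{\partial\mathbf{r}_\alpha}{\partial q_i}\,\dot q_i. \tag{7.93}$$

If we now form the scalar product of this equation with itself, we find

$$\dot{\mathbf{r}}_\alpha^2 = \left(\sum_j \frac{\partial\mathbf{r}_\alpha}{\partial q_j}\,\dot q_j\right)\cdot\left(\sum_k \frac{\partial\mathbf{r}_\alpha}{\partial q_k}\,\dot q_k\right)$$

where I have renamed the summation indices as $j$ and $k$ to avoid future confusion.
The kinetic energy is now given as a triple sum, which I can reorganize and write as 13

$$T = \tfrac{1}{2}\sum_\alpha m_\alpha \dot{\mathbf{r}}_\alpha^2 = \tfrac{1}{2}\sum_{j,k} A_{jk}\,\dot q_j \dot q_k \tag{7.94}$$

where $A_{jk}$ is shorthand for the sum

$$A_{jk} = A_{jk}(q_1, \ldots, q_n) = \sum_\alpha m_\alpha \left(\frac{\partial\mathbf{r}_\alpha}{\partial q_j}\right)\cdot\left(\frac{\partial\mathbf{r}_\alpha}{\partial q_k}\right). \tag{7.95}$$

We can now evaluate the generalized momentum $p_i$ by differentiating (7.94) with
respect to $\dot q_i$ (Problem 7.45),

$$p_i = \frac{\partial\mathcal{L}}{\partial\dot q_i} = \frac{\partial T}{\partial\dot q_i} = \sum_j A_{ij}\,\dot q_j. \tag{7.96}$$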


Returning to Equation (7.90) for the Hamiltonian, we can rewrite the sum on the
right as

$$\sum_i p_i \dot q_i = \sum_i \Big(\sum_j A_{ij}\,\dot q_j\Big)\dot q_i = \sum_{i,j} A_{ij}\,\dot q_i \dot q_j = 2T \tag{7.97}$$

where the last step follows from (7.94). Therefore

$$\mathcal{H} = \sum_i p_i \dot q_i - \mathcal{L} = 2T - (T - U) = T + U. \tag{7.98}$$

That is, provided the transformation between the Cartesian and generalized
coordinates is time-independent, as in (7.91), the Hamiltonian $\mathcal{H}$ is just the total energy of
the system.

12 If the relation (7.91) were explicitly time-dependent there would be one extra term in this
expression for $\dot{\mathbf{r}}_\alpha$, namely $\partial\mathbf{r}_\alpha/\partial t$. This extra term would invalidate the conclusion (7.98) that
$\mathcal{H} = T + U$.

13 We can restate the result (7.94) to say that, provided the generalized coordinates are natural,
the kinetic energy $T$ is a homogeneous quadratic function of the generalized velocities $\dot q_i$. This result
plays an important role in several later developments. See, for instance, Section 11.5.

I have already proved that, provided the Lagrangian is independent of time, the
Hamiltonian is conserved. Thus we now see that time independence of the Lagrangian
[together with the condition (7.91)] implies conservation of energy. We can rephrase
the time independence of $\mathcal{L}$ by saying that $\mathcal{L}$ is unchanged by translations of time,
$t \to t + \epsilon$. Thus the result we have just proved is that invariance of $\mathcal{L}$ under time
translations is related to energy conservation, in much the same way that invariance
of $\mathcal{L}$ under translations of space ($\mathbf{r} \to \mathbf{r} + \boldsymbol{\epsilon}$) is related to conservation of momentum.
Both results are manifestations of Noether’s famous theorem.
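
The distinction between “$\mathcal{H}$ is conserved” and “$\mathcal{H} = T + U$” can be seen in the bead on the spinning hoop of Example 7.6. The Lagrangian (7.68) has no explicit time dependence, so $\mathcal{H}$ should be constant, but $\theta$ is measured in the rotating frame, so the coordinate is not natural and $\mathcal{H}$ need not equal $T + U$. The sketch below is an illustration under assumed parameter values, with a Runge-Kutta integrator chosen for convenience; the function names are hypothetical.

```python
import math

M, R, G, OMEGA = 1.0, 1.0, 9.81, 5.0    # illustrative values (assumed)

def accel(theta):
    """Right side of the equation of motion (7.69)."""
    return (OMEGA**2 * math.cos(theta) - G / R) * math.sin(theta)

def hamiltonian(theta, thetadot):
    """H = p * thetadot - L, with L from (7.68) and p = dL/d(thetadot)."""
    lagrangian = (0.5 * M * R**2 * (thetadot**2 + OMEGA**2 * math.sin(theta)**2)
                  - M * G * R * (1 - math.cos(theta)))
    p_theta = M * R**2 * thetadot
    return p_theta * thetadot - lagrangian

def total_energy(theta, thetadot):
    """T + U evaluated in the inertial frame."""
    kinetic = 0.5 * M * R**2 * (thetadot**2 + OMEGA**2 * math.sin(theta)**2)
    return kinetic + M * G * R * (1 - math.cos(theta))

def spreads(theta=0.3, thetadot=0.0, dt=1e-3, steps=5000):
    """Return (max - min) of H and of T + U along one numerical trajectory."""
    h_values, e_values = [], []
    for _ in range(steps):
        h_values.append(hamiltonian(theta, thetadot))
        e_values.append(total_energy(theta, thetadot))
        # classical fourth-order Runge-Kutta step
        k1x, k1v = thetadot, accel(theta)
        k2x, k2v = thetadot + 0.5 * dt * k1v, accel(theta + 0.5 * dt * k1x)
        k3x, k3v = thetadot + 0.5 * dt * k2v, accel(theta + 0.5 * dt * k2x)
        k4x, k4v = thetadot + dt * k3v, accel(theta + dt * k3x)
        theta += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        thetadot += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return max(h_values) - min(h_values), max(e_values) - min(e_values)

h_spread, e_spread = spreads()
print(h_spread, e_spread)
```

In this sketch the spread of $\mathcal{H}$ should be at the level of integration error, while $T + U$ varies substantially, which is what one expects physically because the agent that keeps the hoop turning at constant $\omega$ does work on the bead.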


7.9 Lagrange’s Equations for Magnetic Forces* 


* This section requires a knowledge of the scalar and vector potentials of electromagnetism. 
Although the ideas described here play an important role in the quantum-mechanical treatment 
of magnetic fields, they will not be used again in this book. 

Although I have so far consistently defined the Lagrangian as $\mathcal{L} = T - U$, there are
systems, such as a charged particle in a magnetic field, which can be treated by the
Lagrangian method, but for which $\mathcal{L}$ is not just $T - U$. The natural question to ask
is then: What is the definition of the Lagrangian for such systems? This is the first 
question I address. 


Definition and Nonuniqueness of the Lagrangian 

Probably the most satisfactory general definition of a Lagrangian for a mechanical 
system is this: 


General Definition of a Lagrangian 

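For a given mechanical system with generalized coordinates $q = (q_1, \ldots, q_n)$,
a Lagrangian $\mathcal{L}$ is a function $\mathcal{L}(q_1, \ldots, q_n, \dot q_1, \ldots, \dot q_n, t)$ of the coordinates
and velocities, such that the correct equations of motion for the system are the
Lagrange equations

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q_i} \qquad [i = 1, \ldots, n].$$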

In other words, a Lagrangian is any function $\mathcal{L}$ for which Lagrange’s equations are
true for the system under consideration.

Obviously, for the systems that we have discussed so far, the old definition
$\mathcal{L} = T - U$ fits this new definition. But the new definition is much more general. In
particular, it is easy to see that our new definition does not define a unique Lagrangian
function. For example, consider a single particle in one dimension and suppose that
we have found a Lagrangian $\mathcal{L}$ for this particle. That is, the equation of motion of the
particle is

$$\frac{\partial\mathcal{L}}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot x}. \tag{7.99}$$

Now let $f(x, \dot x)$ be any function for which

$$\frac{\partial f}{\partial x} = \frac{d}{dt}\frac{\partial f}{\partial\dot x}. \tag{7.100}$$

(It is easy to think up such a function, for instance, $f = x\dot x$.) If we replace $\mathcal{L}$ in
(7.99) by

$$\mathcal{L}' = \mathcal{L} + f$$

then, by virtue of (7.100), the Lagrangian $\mathcal{L}'$ gives exactly the same equation of motion
as $\mathcal{L}$.
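
It may help to verify the suggested function explicitly. The short check below confirms that $f = x\dot x$ satisfies (7.100), and then applies it to a free particle; the free-particle Lagrangian $\tfrac{1}{2}m\dot x^2$ is an assumed example chosen only for simplicity.

```latex
% Check that f = x \dot x satisfies (7.100)
\frac{\partial f}{\partial x} = \dot x,
\qquad
\frac{d}{dt}\frac{\partial f}{\partial \dot x} = \frac{d}{dt}\,x = \dot x .

% Assumed example: free particle with L = (1/2) m \dot x^2 and L' = L + x \dot x
\frac{\partial \mathcal{L}'}{\partial x} = \dot x,
\qquad
\frac{d}{dt}\frac{\partial \mathcal{L}'}{\partial \dot x}
  = \frac{d}{dt}\left( m \dot x + x \right) = m \ddot x + \dot x
\quad\Longrightarrow\quad
m \ddot x = 0 .
```

The extra terms cancel, leaving the same equation of motion as $\mathcal{L}$ alone. One way to see why is that $x\dot x = \frac{d}{dt}\left(\tfrac{1}{2}x^2\right)$ is a total time derivative.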

The lack of uniqueness of the Lagrangian is similar to, though more radical than,
the familiar lack of uniqueness in the potential energy (to which one can add any
constant without changing any of the physical predictions). The crucial point is that
any function $\mathcal{L}$ which gives the right equation of motion has all of the features that we
require of a Lagrangian (for instance, that the integral $\int\mathcal{L}\,dt$ is stationary at the right
path) and so is just as acceptable as any other such function $\mathcal{L}$. If, for a given system,
we can spot a function $\mathcal{L}$ that leads to the right equation of motion, then we don’t
need to debate whether it is the “right” Lagrangian: if it gives the right equation of
motion, then it is just as right as any other conceivable Lagrangian.


Lagrangian for a Charge in a Magnetic Field 

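Consider now a particle (mass $m$ and charge $q$) moving in electric and magnetic fields
$\mathbf{E}$ and $\mathbf{B}$. The force on the particle is the well-known Lorentz force $\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$,
so Newton’s second law reads

$$m\ddot{\mathbf{r}} = q(\mathbf{E} + \dot{\mathbf{r}}\times\mathbf{B}). \tag{7.101}$$

To reformulate (7.101) in Lagrangian form, we have only to spot a function $\mathcal{L}$ for
which the three Lagrange equations are the same as (7.101). This can be done using
the scalar and vector potentials, $V(\mathbf{r}, t)$ and $\mathbf{A}(\mathbf{r}, t)$, in terms of which the two fields
can be written 14

$$\mathbf{E} = -\nabla V - \frac{\partial\mathbf{A}}{\partial t} \qquad\text{and}\qquad \mathbf{B} = \nabla\times\mathbf{A}. \tag{7.102}$$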

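14 See, for example, David J. Griffiths, Introduction to Electrodynamics (Prentice-Hall, 1999),
pp. 416-417.

I now claim that the Lagrangian function 15

$$\mathcal{L}(\mathbf{r}, \dot{\mathbf{r}}, t) = \tfrac{1}{2}m\dot{\mathbf{r}}^2 - q(V - \dot{\mathbf{r}}\cdot\mathbf{A}) \tag{7.103}$$

$$\phantom{\mathcal{L}(\mathbf{r}, \dot{\mathbf{r}}, t)} = \tfrac{1}{2}m(\dot x^2 + \dot y^2 + \dot z^2) - q(V - \dot x A_x - \dot y A_y - \dot z A_z) \tag{7.104}$$

has the desired property, that it reproduces Newton’s second law (7.101). To check
this, let us examine the first of the three Lagrange equations,

$$\frac{\partial\mathcal{L}}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot x}. \tag{7.105}$$

To see what this implies we have to evaluate the two derivatives of the proposed
Lagrangian (7.104):

$$\frac{\partial\mathcal{L}}{\partial x} = -q\left(\frac{\partial V}{\partial x} - \dot x\frac{\partial A_x}{\partial x} - \dot y\frac{\partial A_y}{\partial x} - \dot z\frac{\partial A_z}{\partial x}\right) \tag{7.106}$$

and

$$\frac{\partial\mathcal{L}}{\partial\dot x} = m\dot x + qA_x.$$

When we differentiate this with respect to $t$, we must remember that
$A_x = A_x(x, y, z, t)$. As $t$ varies, $x, y, z$ move with the particle and, by the chain rule, we
find

$$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot x} = m\ddot x + q\left(\dot x\frac{\partial A_x}{\partial x} + \dot y\frac{\partial A_x}{\partial y} + \dot z\frac{\partial A_x}{\partial z} + \frac{\partial A_x}{\partial t}\right). \tag{7.107}$$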

Substituting (7.106) and (7.107) into (7.105), cancelling the two terms in $\dot x$, and
rearranging, we find that Lagrange’s equation (the $x$ component, with the proposed
Lagrangian) is the same as

$$m\ddot x = -q\left(\frac{\partial V}{\partial x} + \frac{\partial A_x}{\partial t}\right) + q\dot y\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) - q\dot z\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) \tag{7.108}$$

or, according to (7.102),

$$m\ddot x = q(E_x + \dot y B_z - \dot z B_y) \tag{7.109}$$

which you will recognize as the $x$ component of Newton’s second law (7.101). Since
the $y$ and $z$ components work in the same way, we conclude that Lagrange’s equations,
with the proposed Lagrangian (7.104), are exactly equivalent to Newton’s second law
for the charged particle. That is, we have successfully recast Newton’s second law for
a charged particle into Lagrangian form with the Lagrangian (7.104).

15 Notice that you can, if you wish, write this as $\mathcal{L} = T - U$, but $U$ is certainly not the usual
PE, since it depends on the velocity $\dot{\mathbf{r}}$; $U$ is sometimes called a “velocity-dependent PE,” but notice
that it is not true that the force on the charge is $-\nabla U$.

Using the Lagrangian (7.104), one can solve various problems involving charged
particles in electric and magnetic fields. (See Problem 7.49 for an example.)
Theoretically, the most important conclusion of this analysis emerges when we evaluate the
generalized momentum. For example,

$$p_x = \frac{\partial\mathcal{L}}{\partial\dot x} = m\dot x + qA_x.$$

Since the $y$ and $z$ components work the same way, we conclude that

$$(\text{generalized momentum, } \mathbf{p}) = m\mathbf{v} + q\mathbf{A}. \tag{7.110}$$

That is, the generalized momentum is the mechanical momentum $m\mathbf{v}$ plus a magnetic
term $q\mathbf{A}$. This result is at the heart of the quantum theory of a charged particle in
a magnetic field, where it turns out that the generalized momentum corresponds to
the differential operator $-i\hbar\nabla$ (where $\hbar$ is Planck’s constant over $2\pi$), so that the
quantum analog of the mechanical momentum $m\mathbf{v}$ is the operator $-i\hbar\nabla - q\mathbf{A}$.
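
The result (7.110) combines naturally with the idea of ignorable coordinates from Section 7.6. As a hedged illustration, assume a uniform field $\mathbf{B} = B\hat{\mathbf{z}}$ with no electric field, described by the vector potential $\mathbf{A} = (0, Bx, 0)$. This gauge is an assumption made here, and other choices give the same $\mathbf{B}$. The Lagrangian (7.104) is then independent of $y$, so $p_y = m\dot y + qBx$ should be conserved even though $m\dot y$ alone is not. The sketch integrates Newton’s second law (7.101) directly and monitors both quantities; the parameter values and function names are illustrative.

```python
M, Q, B = 1.0, 1.0, 2.0      # illustrative mass, charge, field strength (assumed)

def deriv(state):
    """Time derivative of (x, y, vx, vy) under the Lorentz force with B along z."""
    x, y, vx, vy = state
    return (vx, vy, Q * vy * B / M, -Q * vx * B / M)

def rk4_step(state, dt):
    """One classical Runge-Kutta step for the four-component state."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, 0.5 * dt))
    k3 = deriv(shifted(state, k2, 0.5 * dt))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def generalized_py(state):
    """p_y = m * ydot + q * A_y, with the assumed gauge A_y = B * x."""
    x, y, vx, vy = state
    return M * vy + Q * B * x

state = (0.0, 0.0, 1.0, 0.5)                 # assumed initial position and velocity
py_start, mvy_start = generalized_py(state), M * state[3]
for _ in range(1000):                        # integrate for one time unit
    state = rk4_step(state, 1e-3)
py_end, mvy_end = generalized_py(state), M * state[3]
print(py_start, py_end, mvy_start, mvy_end)
```

The mechanical momentum component $m\dot y$ changes as the particle circles, while the generalized momentum $p_y$ should remain at its initial value to within rounding error.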


7.10 Lagrange Multipliers and Constraint Forces* 


* The method of Lagrange multipliers is used in many areas of physics. Nevertheless, we shan’t
be using it again in this book, and you could omit this section without any loss of continuity. 

In this section, we discuss the method of Lagrange multipliers. This powerful method 
finds application in several areas of physics and takes on quite different appearances 
in different contexts. Here I shall treat only its application to Lagrangian mechanics, 16 
and to keep the analysis simple I shall restrict the discussion to two-dimensional 
systems with just one degree of freedom. 

We have seen that one of the strengths of Lagrangian mechanics is that it can bypass 
all of the forces of constraint. However, there are situations where one actually needs 
to know these forces. For example, the designer of a roller coaster needs to know the 
normal force of the track on the car to know how strong to build the track. In this 
case, we can still use a modified form of Lagrange’s equations, but the procedure is 
somewhat different: We do not choose generalized coordinates $q_1, \ldots, q_n$ all of which
can be independently varied. (Remember that it was the independence of $q_1, \ldots, q_n$
that let us use the standard Lagrange equations without worrying about constraints.)
Instead we use a larger number of coordinates and use Lagrange multipliers to handle 
the constraints. 

To illustrate this procedure, we’ll consider a system with just two rectangular 
coordinates x and y, which are restricted by a constraint equation of the form 17 

f(x,y) = const. (7.111) 

16 For applications to other kinds of problems, see, for example, Mathematical Methods in the
Physical Sciences by Mary Boas (Wiley, 1983), Ch. 4, Section 9 and Ch. 9, Section 6.

17 We’ll see directly by example that some typical constraints can be put in this form. In fact, it
is fairly easy to show that any holonomic constraints can be.

For example, we could consider a simple pendulum with just one degree of freedom
(as in Figure 7.2). In treating this by the standard Lagrange approach one would use
the one generalized coordinate $\phi$, the angle between the pendulum and the vertical,
and avoid any discussion of the constraints. If, instead, we choose to use the original
rectangular coordinates x and y, then we must recognize that these coordinates are
not independent; they satisfy the constraint equation

$$f(x, y) = \sqrt{x^2 + y^2} = l$$

where l is the length of the pendulum. We shall find that the method of Lagrange
multipliers lets us accommodate this constraint, determine the time dependence of x and
y, and find the tension in the rod. As a second example, consider the Atwood machine
of Figure 7.6. In our previous treatment we used the one generalized coordinate x,
the position of the mass $m_1$, but we could instead use both coordinates x and y (the
positions of both masses), provided we remember that the constancy of the string’s
length imposes the constraint that

f(x, y) = x + y = const.

Here too, Lagrange multipliers will let us accommodate this constraint, solve for the
time dependence of x and y, and find the constraint force, which is here the tension
in the string.

To set up our new method we start from Hamilton’s principle. Our Lagrangian has 
the form $\mathcal{L}(x, \dot{x}, y, \dot{y})$. (We could allow it to have an explicit dependence on t as well,
but to simplify notation I shall assume it doesn’t.) The proof of Hamilton’s principle 
given in Section 7.4 applies even when the coordinates are constrained. (Indeed it 
was designed to allow for constraints.) Thus we can conclude as before that the action 
integral 


$$S = \int_{t_1}^{t_2} \mathcal{L}(x, \dot{x}, y, \dot{y})\,dt \tag{7.112}$$

is stationary when taken along the actual path followed. If we denote this “right” path 
by x(t), y(t) and imagine a small displacement to a neighboring “wrong” path, 


$$\begin{aligned} x(t) &\to x(t) + \delta x(t) \\ y(t) &\to y(t) + \delta y(t) \end{aligned} \tag{7.113}$$

then, provided the displacement is consistent with the constraint equation, the action
integral (7.112) is unchanged, $\delta S = 0$. To exploit this, we must write $\delta S$ in terms of
the small displacements $\delta x$ and $\delta y$:


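$$\delta S = \int \left( \frac{\partial\mathcal{L}}{\partial x}\,\delta x + \frac{\partial\mathcal{L}}{\partial\dot{x}}\,\delta\dot{x} + \frac{\partial\mathcal{L}}{\partial y}\,\delta y + \frac{\partial\mathcal{L}}{\partial\dot{y}}\,\delta\dot{y} \right) dt. \tag{7.114}$$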


The second and fourth terms can be integrated by parts, and we conclude that 

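$$\delta S = \int \left( \frac{\partial\mathcal{L}}{\partial x} - \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}} \right) \delta x\,dt + \int \left( \frac{\partial\mathcal{L}}{\partial y} - \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{y}} \right) \delta y\,dt = 0 \tag{7.115}$$

for any displacements $\delta x$ and $\delta y$ consistent with the constraints.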
If (7.115) were true for any displacements, we could immediately prove two
separate Lagrange equations, one for x and one for y. (Choosing $\delta y = 0$, we would be
left with just the first integral; since this has to vanish for any choice of $\delta x$, the factor
in parentheses would have to be zero, which implies the usual Lagrange equation with
respect to x. And similarly for y.) This is exactly the correct conclusion for the case
that there are no constraints.

However, there are constraints, and (7.115) is only true for displacements $\delta x$ and
$\delta y$ consistent with the constraints. Therefore, we proceed as follows: Since all points
with which we are concerned satisfy f(x, y) = const, the displacement (7.113) leaves
f(x, y) unchanged, so


$$\delta f = \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y = 0 \tag{7.116}$$

for any displacement consistent with the constraints. Since this is zero, we can multiply
it by an arbitrary function $\lambda(t)$ (this is the Lagrange multiplier) and add it to
the integrand in (7.115), without changing the value of the integral (namely zero).
Therefore


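$$\delta S = \int \left( \frac{\partial\mathcal{L}}{\partial x} + \lambda(t)\frac{\partial f}{\partial x} - \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}} \right) \delta x\,dt + \int \left( \frac{\partial\mathcal{L}}{\partial y} + \lambda(t)\frac{\partial f}{\partial y} - \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{y}} \right) \delta y\,dt = 0 \tag{7.117}$$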


for any displacement consistent with the constraints. Now comes the supreme cunning:
So far $\lambda(t)$ is an arbitrary function of t, but we can choose it so that the coefficient of
$\delta x$ in the first integral is zero. That is, by choice of the multiplier $\lambda(t)$, we can arrange
that


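$$\frac{\partial\mathcal{L}}{\partial x} + \lambda\frac{\partial f}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}} \tag{7.118}$$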


along the actual path of the system. This is the first of two modified Lagrange equations
and differs from the usual equation only by the extra term involving $\lambda$ on the left. With
the multiplier chosen in this way, the whole first integral in (7.117) is zero. Therefore
the second integral is also zero (since their sum is), and this is true for any choice
of $\delta y$. (The constraint places no restriction on $\delta x$ or $\delta y$ separately; it only fixes $\delta x$
once $\delta y$ is chosen, or vice versa.) Therefore the coefficient of $\delta y$ in this second integral
must also be zero and we have a second modified Lagrange equation,


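$$\frac{\partial\mathcal{L}}{\partial y} + \lambda\frac{\partial f}{\partial y} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{y}} \tag{7.119}$$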
We now have two modified Lagrange equations for the two unknown functions
x(t) and y(t). This elegant result has been bought at the price of introducing a third
unknown function, the Lagrange multiplier $\lambda(t)$. To find three unknown functions, we
need three equations, but fortunately a third equation is already at hand, the constraint
equation

f(x, y) = const. (7.120)


The three equations (7.118), (7.119), and (7.120) are sufficient, in principle at least, to
determine the coordinates x(t) and y(t) and the multiplier $\lambda(t)$. Before we illustrate
this with an example, there is one more bit of theory to develop.

So far the Lagrange multiplier $\lambda(t)$ is just a mathematical artifact, introduced to
help us solve our problem. However, it turns out to be closely related to the forces of
constraint. To see this, we have only to look more closely at the modified Lagrange
equations (7.118) and (7.119). The Lagrangian of our present discussion has the
form

$$\mathcal{L} = \tfrac{1}{2}m_1\dot{x}^2 + \tfrac{1}{2}m_2\dot{y}^2 - U(x, y).$$

(In a problem like the simple pendulum, x and y are the two coordinates of a single
mass and $m_1 = m_2$. In a problem like the Atwood machine, there are two separate
masses and $m_1$ and $m_2$ are not necessarily equal.) Inserting this Lagrangian into
(7.118), we find

$$-\frac{\partial U}{\partial x} + \lambda\frac{\partial f}{\partial x} = m_1\ddot{x}. \tag{7.121}$$


Now, on the left side $-\partial U/\partial x$ is the x component of the nonconstraint force.
(Remember that U was defined as the potential energy of the nonconstraint forces.)
On the right, $m_1\ddot{x}$ is the x component of the total force, equal to the sum of
the nonconstraint and constraint forces. Thus $m_1\ddot{x} = -\partial U/\partial x + F_x^{\text{cstr}}$. Canceling
the term $-\partial U/\partial x$ from both sides of (7.121), we reach the important conclusion
that

$$\lambda\frac{\partial f}{\partial x} = F_x^{\text{cstr}} \tag{7.122}$$

with a corresponding result for the y components. This then is the significance of the
Lagrange multiplier: Multiplied by the appropriate partial derivatives of the constraint
function f(x, y), the Lagrange multiplier $\lambda(t)$ gives the corresponding components
of the constraint force.
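To see what (7.122) says in a familiar case, here is a short worked sketch for the pendulum constraint introduced earlier, $f(x, y) = \sqrt{x^2 + y^2} = l$, assuming (as that form of the constraint implies) that the pivot is at the origin. The radial unit vector $\hat{\mathbf{r}}$ and the tension magnitude $F_t$ are notation introduced here only for illustration.

$$\frac{\partial f}{\partial x} = \frac{x}{\sqrt{x^2 + y^2}} = \frac{x}{l}, \qquad \frac{\partial f}{\partial y} = \frac{y}{l} \quad\Longrightarrow\quad \mathbf{F}^{\text{cstr}} = \lambda\left(\frac{x}{l}, \frac{y}{l}\right) = \lambda\,\hat{\mathbf{r}}.$$

The constraint force therefore points along the rod, as it must. If the rod pulls the bob toward the pivot with a tension of magnitude $F_t$ (that is, in the direction $-\hat{\mathbf{r}}$), then the multiplier is $\lambda = -F_t$.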

Let us now see how these ideas work in practice by using the formalism to analyze
the example of the Atwood machine.

Example 7.8 Atwood’s Machine Using a Lagrange Multiplier

Analyze the Atwood machine of Figure 7.6 (shown again here as Figure 7.11) 
by the method of Lagrange multipliers and using the coordinates x and y of the 
two masses. 

In terms of the given coordinates, the Lagrangian is 

$$\mathcal{L} = T - U = \tfrac{1}{2}m_1\dot{x}^2 + \tfrac{1}{2}m_2\dot{y}^2 + m_1 g x + m_2 g y \tag{7.123}$$

and the constraint equation is 

f(x, y) = x + y = const. (7.124) 

The modified Lagrange equation (7.118) for x reads 

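$$\frac{\partial\mathcal{L}}{\partial x} + \lambda\frac{\partial f}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}} \quad\text{or}\quad m_1 g + \lambda = m_1\ddot{x} \tag{7.125}$$

and that for y is

$$\frac{\partial\mathcal{L}}{\partial y} + \lambda\frac{\partial f}{\partial y} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{y}} \quad\text{or}\quad m_2 g + \lambda = m_2\ddot{y}. \tag{7.126}$$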

These two equations, together with the constraint equation (7.124), are easily
solved for the unknowns x(t), y(t), and $\lambda(t)$. From (7.124) we see that $\ddot{y} = -\ddot{x}$,
and then subtracting (7.126) from (7.125) we can eliminate $\lambda$ and arrive at the
same result as before,

$$\ddot{x} = (m_1 - m_2)g/(m_1 + m_2).$$
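The elimination just described is easy to check numerically. The following Python sketch (the function name and the sample masses are illustrative assumptions, not part of the original example) solves (7.125) and (7.126) together with $\ddot{y} = -\ddot{x}$ for the acceleration and the multiplier.

```python
def atwood_with_multiplier(m1, m2, g=9.8):
    """Solve (7.125), (7.126) and the constraint for (x_ddot, lam).

    (7.125): m1*g + lam = m1*x_ddot
    (7.126): m2*g + lam = m2*y_ddot, with y_ddot = -x_ddot from x + y = const.
    """
    # Subtracting (7.126) from (7.125) eliminates lam.
    x_ddot = (m1 - m2) * g / (m1 + m2)
    # Back-substitute into (7.125) to recover the multiplier.
    lam = m1 * x_ddot - m1 * g
    return x_ddot, lam


# Sample values (SI units assumed): m1 = 3, m2 = 1, g = 10.
x_ddot, lam = atwood_with_multiplier(3.0, 1.0, g=10.0)
# Residual of (7.126) with y_ddot = -x_ddot; it should vanish up to rounding.
residual = 1.0 * 10.0 + lam - 1.0 * (-x_ddot)
print(x_ddot, lam, residual)
```

For these sample values the multiplier comes out negative, with magnitude equal to the familiar string tension $2m_1 m_2 g/(m_1 + m_2)$.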

To better understand the two modified Lagrange equations (7.125) and
(7.126), it is helpful to compare them with the two equations of the Newtonian
solution. Newton’s second law for $m_1$ is

$$m_1 g - F_t = m_1\ddot{x}$$

where $F_t$ is the tension in the string, and that for $m_2$ is

$$m_2 g - F_t = m_2\ddot{y}.$$

These are precisely the two Lagrange equations (7.125) and (7.126), with the
Lagrange multiplier identified as the constraint force

$$\lambda = -F_t.$$

[Two small comments: The minus sign occurs because both coordinates x and y
were measured downward, whereas both tension forces are upward. In general,
according to (7.122) the constraint force is $\lambda\,\partial f/\partial x$, but in this simple case,
$\partial f/\partial x = 1$.]

Figure 7.11 The Atwood machine again.


You can find some more examples of the use of Lagrange multipliers in Problems 
7.50 through 7.52. 


Principal Definitions and Equations of Chapter 7

The Lagrangian

The Lagrangian $\mathcal{L}$ of a conservative system is defined as

$\mathcal{L} = T - U$, [Eq. (7.3)]

where T and U are respectively the kinetic and potential energies. 

Generalized Coordinates 

The n parameters $q_1, \ldots, q_n$ are generalized coordinates for an N-particle system if
every particle’s position $\mathbf{r}_\alpha$ can be expressed as a function of $q_1, \ldots, q_n$ (and possibly
the time t) and vice versa, and if n is the smallest number that allows the system to
be described in this way. [Eqs. (7.34) & (7.35)]

If n < 3N (in three dimensions) the system is said to be constrained. The coordinates
$q_1, \ldots, q_n$ are said to be natural if the functional relationships of the $\mathbf{r}_\alpha$ to
$q_1, \ldots, q_n$ are independent of time. The number of degrees of freedom of a system
is the number of coordinates that can be independently varied. If the number of degrees
of freedom is equal to the number of generalized coordinates (in some sense the
“normal” state of affairs), the system is said to be holonomic. [Section 7.3]

Lagrange’s Equations 

For any holonomic system, Newton’s second law is equivalent to the n Lagrange
equations

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_i} \qquad [i = 1, \ldots, n]$$ [Sections 7.3 & 7.4]

and the Lagrange equations are in turn equivalent to Hamilton’s principle, a fact we
used only to prove the Lagrange equations. [Eq. (7.8)]


Generalized Momenta and Ignorable Coordinates 

The ith generalized momentum $p_i$ is defined to be the derivative

$$p_i = \frac{\partial\mathcal{L}}{\partial\dot{q}_i}.$$

If $\partial\mathcal{L}/\partial q_i = 0$, then we say the coordinate $q_i$ is ignorable and the corresponding
generalized momentum is constant. [Section 7.6]


The Hamiltonian 

The Hamiltonian is defined as 

$$\mathcal{H} = \sum_{i=1}^{n} p_i\dot{q}_i - \mathcal{L}.$$ [Eq. (7.90)]

If $\partial\mathcal{L}/\partial t = 0$, then $\mathcal{H}$ is conserved; if the coordinates $q_1, \ldots, q_n$ are natural, $\mathcal{H}$ is just
the energy of the system.


Lagrangian for a Charge in an Electromagnetic Field 

The Lagrangian for a charge q in an electromagnetic field is 

$$\mathcal{L}(\mathbf{r}, \dot{\mathbf{r}}, t) = \tfrac{1}{2}m\dot{\mathbf{r}}^2 - q(V - \dot{\mathbf{r}}\cdot\mathbf{A}).$$ [Eq. (7.103)]


Problems for Chapter 7

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult (***).

section 7.1 Lagrange’s Equations for Unconstrained Motion 

7.1 * Write down the Lagrangian for a projectile (subject to no air resistance) in terms of its Cartesian 
coordinates (x,y,z), with z measured vertically upward. Find the three Lagrange equations and show 
that they are exactly what you would expect for the equations of motion. 

7.2 ★ Write down the Lagrangian for a one-dimensional particle moving along the x axis and subject 
to a force F = —kx (with k positive). Find the Lagrange equation of motion and solve it. 

7.3 * Consider a mass m moving in two dimensions with potential energy $U(x, y) = \tfrac{1}{2}kr^2$, where
$r^2 = x^2 + y^2$. Write down the Lagrangian, using coordinates x and y, and find the two Lagrange
equations of motion. Describe their solutions. [This is the potential energy of an ion in an “ion trap,”
which can be used to study the properties of individual atomic ions.]
7.4 * Consider a mass m moving in a frictionless plane that slopes at an angle $\alpha$ with the horizontal.
Write down the Lagrangian in terms of coordinates x, measured horizontally across the slope, and y,
measured down the slope. (Treat the system as two-dimensional, but include the gravitational potential
energy.) Find the two Lagrange equations and show that they are what you should have expected.

7.5 * Find the components of $\nabla f(r, \phi)$ in two-dimensional polar coordinates. [Hint: Remember that
the change in the scalar f as a result of an infinitesimal displacement $d\mathbf{r}$ is $df = \nabla f \cdot d\mathbf{r}$.]

7.6 * Consider two particles moving unconstrained in three dimensions, with potential energy
$U(\mathbf{r}_1, \mathbf{r}_2)$. (a) Write down the six equations of motion obtained by applying Newton’s second law
to each particle. (b) Write down the Lagrangian $\mathcal{L}(\mathbf{r}_1, \mathbf{r}_2, \dot{\mathbf{r}}_1, \dot{\mathbf{r}}_2) = T - U$ and show that the six
Lagrange equations are the same as the six Newtonian equations of part (a). This establishes the validity of
Lagrange’s equations in rectangular coordinates, which in turn establishes Hamilton’s principle. Since
the latter is independent of coordinates, this proves Lagrange’s equations in any coordinate system.

7.7 * Do Problem 7.6, but for N particles moving unconstrained in three dimensions (in which case 
there are 3 N equations of motion). 

7.8 ** (a) Write down the Lagrangian $\mathcal{L}(x_1, x_2, \dot{x}_1, \dot{x}_2)$ for two particles of equal masses, $m_1 = m_2 = m$,
confined to the x axis and connected by a spring with potential energy $U = \tfrac{1}{2}kx^2$. [Here x is the
extension of the spring, $x = (x_1 - x_2 - l)$, where l is the spring’s unstretched length, and I assume
that mass 1 remains to the right of mass 2 at all times.] (b) Rewrite $\mathcal{L}$ in terms of the new variables
$X = \tfrac{1}{2}(x_1 + x_2)$ (the CM position) and x (the extension), and write down the two Lagrange equations
for X and x. (c) Solve for X(t) and x(t) and describe the motion.

section 7.3 Constrained Systems in General 

7.9 * Consider a bead that is threaded on a rigid circular hoop of radius R lying in the xy plane with its
center at O, and use the angle $\phi$ of two-dimensional polar coordinates as the one generalized coordinate
to describe the bead’s position. Write down the equations that give the Cartesian coordinates (x, y) in
terms of $\phi$ and the equation that gives the generalized coordinate $\phi$ in terms of (x, y).

7.10 * A particle is confined to move on the surface of a circular cone with its axis on the z axis, vertex at
the origin (pointing down), and half-angle $\alpha$. The particle’s position can be specified by two generalized
coordinates, which you can choose to be the coordinates $(\rho, \phi)$ of cylindrical polar coordinates. Write
down the equations that give the three Cartesian coordinates of the particle in terms of the generalized
coordinates $(\rho, \phi)$ and vice versa.

7.11 * Consider the pendulum of Figure 7.4, suspended inside a railroad car, but suppose that the car
is oscillating back and forth, so that the point of suspension has position $x_s = A\cos\omega t$, $y_s = 0$. Use the
angle $\phi$ as the generalized coordinate and write down the equations that give the Cartesian coordinates
of the bob in terms of $\phi$ and vice versa.

section 7.4 Proof of Lagrange’s Equations with Constraints 

7.12 * Lagrange’s equations in the form discussed in this chapter hold only if the forces (at least the
nonconstraint forces) are derivable from a potential energy. To get an idea how they can be modified
to include forces like friction, consider the following: A single particle in one dimension is subject to
various conservative forces (net conservative force $= F = -\partial U/\partial x$) and a nonconservative force (let’s
call it $F_{\text{fric}}$). Define the Lagrangian as $\mathcal{L} = T - U$ and show that the appropriate modification is

$$\frac{\partial\mathcal{L}}{\partial x} + F_{\text{fric}} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}}.$$

7.13 ** In Section 7.4 [Equations (7.41) through (7.51)], I proved Lagrange’s equations for a single
particle constrained to move on a two-dimensional surface. Go through the same steps to prove
Lagrange’s equations for a system consisting of two particles subject to various unspecified constraints.
[Hint: The net force on particle 1 is the sum of the total constraint force $\mathbf{F}_1^{\text{cstr}}$ and the total nonconstraint
force $\mathbf{F}_1$, and likewise for particle 2. The constraint forces come in many guises (the normal force of a
surface, the tension force of a string tied between the particles, etc.), but it is always true that the net
work done by all constraint forces in any displacement consistent with the constraints is zero; this is
the defining property of constraint forces. Meanwhile, we take for granted that the nonconstraint forces
are derivable from a potential energy $U(\mathbf{r}_1, \mathbf{r}_2, t)$; that is, $\mathbf{F}_1 = -\nabla_1 U$ and likewise for particle 2. Write
down the difference $\delta S$ between the action integral for the right path given by $\mathbf{r}_1(t)$ and $\mathbf{r}_2(t)$ and any
nearby wrong path given by $\mathbf{r}_1(t) + \boldsymbol{\epsilon}_1(t)$ and $\mathbf{r}_2(t) + \boldsymbol{\epsilon}_2(t)$. Paralleling the steps of Section 7.4, you
can show that $\delta S$ is given by an integral analogous to (7.49), and this is zero by the defining property
of constraint forces.]

section 7.5 Examples of Lagrange’s Equations 

7.14 * Figure 7.12 shows a crude model of a yoyo. A massless string is suspended vertically from
a fixed point and the other end is wrapped several times around a uniform cylinder of mass m and
radius R. When the cylinder is released it moves vertically down, rotating as the string unwinds.
Write down the Lagrangian, using the distance x as your generalized coordinate. Find the Lagrange
equation of motion and show that the cylinder accelerates downward with $\ddot{x} = 2g/3$. [Hints: You need
to remember from your introductory physics course that the total kinetic energy of a body like the yoyo
is $T = \tfrac{1}{2}mv^2 + \tfrac{1}{2}I\omega^2$, where v is the velocity of the center of mass, I is the moment of inertia (for a
uniform cylinder, $I = \tfrac{1}{2}mR^2$) and $\omega$ is the angular velocity about the CM. You can express $\omega$ in terms
of $\dot{x}$.]

7.15 * A mass $m_1$ rests on a frictionless horizontal table and is attached to a massless string. The string
runs horizontally to the edge of the table, where it passes over a massless, frictionless pulley and then
hangs vertically down. A second mass $m_2$ is now attached to the bottom end of the string. Write down
the Lagrangian for the system. Find the Lagrange equation of motion, and solve it for the acceleration of
the blocks. For your generalized coordinate, use the distance x of the second mass below the tabletop.
7.16 * Write down the Lagrangian for a cylinder (mass m, radius R, and moment of inertia I) that
rolls without slipping straight down an inclined plane which is at an angle $\alpha$ from the horizontal. Use
as your generalized coordinate the cylinder’s distance x measured down the plane from its starting
point. Write down the Lagrange equation and solve it for the cylinder’s acceleration $\ddot{x}$. Remember that
$T = \tfrac{1}{2}mv^2 + \tfrac{1}{2}I\omega^2$, where v is the velocity of the center of mass and $\omega$ is the angular velocity.

7.17 * Use the Lagrangian method to find the acceleration of the Atwood machine of Example 7.3
(page 255) including the effect of the pulley’s having moment of inertia I. (The kinetic energy of the
pulley is $\tfrac{1}{2}I\omega^2$, where $\omega$ is its angular velocity.)

7.18 * A mass m is suspended from a massless string, the other end of which is wrapped several times
around a horizontal cylinder of radius R and moment of inertia I, which is free to rotate about a fixed
horizontal axle. Using a suitable coordinate, set up the Lagrangian and the Lagrange equation of motion,
and find the acceleration of the mass m. [The kinetic energy of the rotating cylinder is $\tfrac{1}{2}I\omega^2$.]

7.19 * In Example 7.5 (page 258) the two accelerations are given by Equations (7.66) and (7.67).
Check that the acceleration of the block is given correctly in the limit $M \to 0$. [You need to find the
components of this acceleration relative to the table.]

7.20 * A smooth wire is bent into the shape of a helix, with cylindrical polar coordinates $\rho = R$ and
$z = \lambda\phi$, where R and $\lambda$ are constants and the z axis is vertically up (and gravity vertically down). Using
z as your generalized coordinate, write down the Lagrangian for a bead of mass m threaded on the wire.
Find the Lagrange equation and hence the bead’s vertical acceleration $\ddot{z}$. In the limit that $R \to 0$, what
is $\ddot{z}$? Does this make sense?

7.21 * The center of a long frictionless rod is pivoted at the origin, and the rod is forced to rotate in
a horizontal plane with constant angular velocity $\omega$. Write down the Lagrangian for a bead threaded
on the rod, using r as your generalized coordinate, where r, $\phi$ are the polar coordinates of the bead.
(Notice that $\phi$ is not an independent variable since it is fixed by the rotation of the rod to be $\phi = \omega t$.)
Solve Lagrange’s equation for r(t). What happens if the bead is initially at rest at the origin? If it is
released from any point $r_0 > 0$, show that r(t) eventually grows exponentially. Explain your results in
terms of the centrifugal force $m\omega^2 r$.

7.22 * Using the usual angle $\phi$ as generalized coordinate, write down the Lagrangian for a simple
pendulum of length l suspended from the ceiling of an elevator that is accelerating upward with
constant acceleration a. (Be careful when writing T; it is probably safest to write the bob’s velocity
in component form.) Find the Lagrange equation of motion and show that it is the same as that for a
normal, nonaccelerating pendulum, except that g has been replaced by g + a. In particular, the angular
frequency of small oscillations is $\sqrt{(g + a)/l}$.

7.23 * A small cart (mass m) is mounted on rails inside a large cart. The two are attached by a spring
(force constant k) in such a way that the small cart is in equilibrium at the midpoint of the large. The
distance of the small cart from its equilibrium is denoted x and that of the large one from a fixed
point on the ground is X, as shown in Figure 7.13. The large cart is now forced to oscillate such that
$X = A\cos\omega t$, with both A and $\omega$ fixed. Set up the Lagrangian for the motion of the small cart and
show that the Lagrange equation has the form

$$\ddot{x} + \omega_0^2 x = B\cos\omega t$$

where $\omega_0$ is the natural frequency $\omega_0 = \sqrt{k/m}$ and B is a constant. This is the form assumed in Section
5.5, Equation (5.57), for driven oscillations (except that we are here ignoring damping). Thus the system
described here would be one way to realize the motion discussed there. (We could fill the large cart
with molasses to provide some damping.)

Figure 7.13 Problem 7.23

7.24 * We saw in Example 7.3 (page 255) that the acceleration of the Atwood machine is $\ddot{x} =
(m_1 - m_2)g/(m_1 + m_2)$. It is sometimes claimed that this result is “obvious” because, it is said, the
effective force on the system is $(m_1 - m_2)g$ and the effective mass is $(m_1 + m_2)$. This is not, perhaps,
all that obvious, but it does emerge very naturally in the Lagrangian approach. Recall that Lagrange’s
equation can be thought of as [Equation (7.17)]

(generalized force) = (rate of change of generalized momentum).

Show that for the Atwood machine the generalized force is $(m_1 - m_2)g$ and the generalized momentum
$(m_1 + m_2)\dot{x}$. Comment.

7.25 * Prove that the potential energy of a central force $\mathbf{F} = -kr^n\hat{\mathbf{r}}$ (with $n \neq -1$) is $U = kr^{n+1}/(n + 1)$.
In particular, if n = 1, then $\mathbf{F} = -k\mathbf{r}$ and $U = \tfrac{1}{2}kr^2$.

7.26 * In Example 7.7 (page 264), we saw that the bead on a spinning hoop can make small oscillations
about any of its stable equilibrium points. Verify that the oscillation frequency $\Omega'$ defined in (7.79) is
equal to $\sqrt{\omega^2 - (g/\omega R)^2}$ as claimed in (7.80).

7.27 ** Consider a double Atwood machine constructed as follows: A mass 4m is suspended from a
string that passes over a massless pulley on frictionless bearings. The other end of this string supports 
a second similar pulley, over which passes a second string supporting a mass of 3m at one end and m 
at the other. Using two suitable generalized coordinates, set up the Lagrangian and use the Lagrange 
equations to find the acceleration of the mass 4m when the system is released. Explain why the top 
pulley rotates even though it carries equal weights on each side. 

7.28 ** A couple of points need checking from Example 7.6 (page 260). (a) From the point of view of
a noninertial frame rotating with the hoop, the bead is subject to the force of gravity and a centrifugal
force $m\omega^2\rho$ (in addition to the constraint force, which is the normal force of the wire). Verify that at the
equilibrium points given by (7.71), the tangential components of these two forces balance one another.
(A free-body diagram will help.) (b) Verify that the equilibrium point at the top ($\theta = \pi$) is unstable.
(c) Verify that the equilibrium at the second point given by (7.71) (the one on the left, with $\theta$ negative)
is stable.

7.29 ** Figure 7.14 shows a simple pendulum (mass m, length l) whose point of support P is attached
to the edge of a wheel (center O, radius R) that is forced to rotate at a fixed angular velocity $\omega$. At
t = 0, the point P is level with O on the right. Write down the Lagrangian and find the equation of
motion for the angle $\phi$. [Hint: Be careful writing down the kinetic energy T. A safe way to get the
velocity right is to write down the position of the bob at time t, and then differentiate.] Check that your
answer makes sense in the special case that $\omega = 0$.
7.30 ** Consider the pendulum of Figure 7.4, suspended inside a railroad car that is being forced to
accelerate with a constant acceleration a. (a) Write down the Lagrangian for the system and the equation
of motion for the angle $\phi$. Use a trick similar to the one used in Equation (5.11) to write the combination
of $\sin\phi$ and $\cos\phi$ as a multiple of $\sin(\phi + \beta)$. (b) Find the equilibrium angle $\phi$ at which the pendulum
can remain fixed (relative to the car) as the car accelerates. Use the equation of motion to show that
this equilibrium is stable. What is the frequency of small oscillations about this equilibrium position?
(We shall find a much slicker way to solve this problem in Chapter 9, but the Lagrangian method does
give a straightforward route to the answer.)

7.31 ** A simple pendulum (mass M and length L) is suspended from a cart (mass m) that can oscillate
on the end of a spring of force constant k, as shown in Figure 7.15. (a) Write the Lagrangian in terms
of the two generalized coordinates x and $\phi$, where x is the extension of the spring from its equilibrium
length. (Read the hint in Problem 7.29.) Find the two Lagrange equations. (Warning: They’re pretty
ugly!) (b) Simplify the equations to the case that both x and $\phi$ are small. (They’re still pretty ugly, and
note, in particular, that they are still coupled; that is, each equation involves both variables. Nonetheless,
we shall see how to solve these equations in Chapter 11; see particularly Problem 11.19.)

Figure 7.15 Problem 7.31


7.32 ** Consider the cube balanced on a cylinder as described in Example 4.7 (page 130). Assuming
that b < r, use the Lagrangian approach to find the angular frequency of small oscillations about the
top. The simplest procedure is to make the small-angle approximations to $\mathcal{L}$ before you differentiate to
get Lagrange’s equation. As usual, be careful in writing down the kinetic energy; this is $\tfrac{1}{2}(mv^2 + I\dot{\theta}^2)$,
where v is the speed of the CM and I is the moment of inertia about the CM ($2mb^2/3$). The safe way
to find v is to write down the coordinates of the CM and then differentiate.
7.33 ** A bar of soap (mass m) is at rest on a frictionless rectangular plate that rests on a horizontal
table. At time t = 0, I start raising one edge of the plate so that the plate pivots about the opposite
edge with constant angular velocity $\omega$, and the soap starts to slide toward the downhill edge. Show that
the equation of motion for the soap has the form $\ddot{x} - \omega^2 x = -g\sin\omega t$, where x is the soap’s distance
from the downhill edge. Solve this for x(t), given that $x(0) = x_0$. [You’ll need to use the method used
to solve Equation (5.48). You can easily solve the homogeneous equation; for a particular solution try
$x = A\sin\omega t$ and solve for A.]

7.34 ** Consider the well-known problem of a cart of mass m moving along the x axis attached to a
spring (force constant k), whose other end is held fixed (Figure 5.2). If we ignore the mass of the spring
(as we almost always do) then we know that the cart executes simple harmonic motion with angular
frequency $\omega = \sqrt{k/m}$. Using the Lagrangian approach, you can find the effect of the spring’s mass
M, as follows: (a) Assuming that the spring is uniform and stretches uniformly, show that its kinetic
energy is $\tfrac{1}{6}M\dot{x}^2$. (As usual x is the extension of the spring from its equilibrium length.) Write down
the Lagrangian for the system of cart plus spring. (Note: The potential energy is still $\tfrac{1}{2}kx^2$.) (b) Write
down the Lagrange equation and show that the cart still executes SHM but with angular frequency
$\omega = \sqrt{k/(m + M/3)}$; that is, the effect of the spring’s mass M is just to add M/3 to the mass of the
cart.



7.35 ** Figure 7.16 is a bird’s-eye view of a smooth horizontal wire hoop that is forced to rotate at a
fixed angular velocity $\omega$ about a vertical axis through the point A. A bead of mass m is threaded on the
hoop and is free to move around it, with its position specified by the angle $\phi$ that it makes at the center
with the diameter AB. Find the Lagrangian for this system using $\phi$ as your generalized coordinate.
(Read the hint in Problem 7.29.) Use the Lagrange equation of motion to show that the bead oscillates
about the point B exactly like a simple pendulum. What is the frequency of these oscillations if their
amplitude is small?

7.36 *** A pendulum is made from a massless spring (force constant k and unstretched length $l_0$) that
is suspended at one end from a fixed pivot O and has a mass m attached to its other end. The spring
can stretch and compress but cannot bend, and the whole system is confined to a single vertical plane.
(a) Write down the Lagrangian for the pendulum, using as generalized coordinates the usual angle $\phi$
and the length r of the spring. (b) Find the two Lagrange equations of the system and interpret them
in terms of Newton’s second law, as given in Equation (1.48). (c) The equations of part (b) cannot be
solved analytically in general. However, they can be solved for small oscillations. Do this and describe
the motion. [Hint: Let l denote the equilibrium length of the spring with the mass hanging from it and
write $r = l + \epsilon$. “Small oscillations” involve only small values of $\epsilon$ and $\phi$, so you can use the
small-angle approximations and drop from your equations all terms that involve powers of $\epsilon$ or $\phi$ (or their
derivatives) higher than the first power (also products of $\epsilon$ and $\phi$ or their derivatives). This dramatically
simplifies and uncouples the equations.]

7.37 *** Two equal masses, $m_1 = m_2 = m$, are joined by a massless string of length $L$ that passes
through a hole in a frictionless horizontal table. The first mass slides on the table while the second
hangs below the table and moves up and down in a vertical line. (a) Assuming the string remains taut,
write down the Lagrangian for the system in terms of the polar coordinates $(r, \phi)$ of the mass on the table.
(b) Find the two Lagrange equations and interpret the $\phi$ equation in terms of the angular momentum $\ell$
of the first mass. (c) Express $\dot{\phi}$ in terms of $\ell$ and eliminate $\dot{\phi}$ from the $r$ equation. Now use the $r$ equation
to find the value $r = r_0$ at which the first mass can move in a circular path. Interpret your answer in
Newtonian terms. (d) Suppose the first mass is moving in this circular path and is given a small radial
nudge. Write $r(t) = r_0 + \epsilon(t)$ and rewrite the $r$ equation in terms of $\epsilon(t)$, dropping all powers of $\epsilon(t)$
higher than linear. Show that the circular path is stable and that $r(t)$ oscillates sinusoidally about $r_0$.
What is the frequency of its oscillations?

7.38 *** A particle is confined to move on the surface of a circular cone with its axis on the vertical
$z$ axis, vertex at the origin (pointing down), and half-angle $\alpha$. (a) Write down the Lagrangian $\mathcal{L}$ in
terms of the spherical polar coordinates $r$ and $\phi$. (b) Find the two equations of motion. Interpret the $\phi$
equation in terms of the angular momentum $\ell_z$, and use it to eliminate $\dot{\phi}$ from the $r$ equation in favor
of the constant $\ell_z$. Does your $r$ equation make sense in the case that $\ell_z = 0$? Find the value $r_0$ of $r$
at which the particle can remain in a horizontal circular path. (c) Suppose that the particle is given a
small radial kick, so that $r(t) = r_0 + \epsilon(t)$, where $\epsilon(t)$ is small. Use the $r$ equation to decide whether
the circular path is stable. If so, with what frequency does $r$ oscillate about $r_0$?

7.39 *** (a) Write down the Lagrangian for a particle moving in three dimensions under the influence
of a conservative central force with potential energy $U(r)$, using spherical polar coordinates $(r, \theta, \phi)$.
(b) Write down the three Lagrange equations and explain their significance in terms of radial
acceleration, angular momentum, and so forth. (The $\theta$ equation is the tricky one, since you will find it implies
that the $\phi$ component of $\boldsymbol{\ell}$ varies with time, which seems to contradict conservation of angular
momentum. Remember, however, that $\ell_\phi$ is the component of $\boldsymbol{\ell}$ in a variable direction.) (c) Suppose that
initially the motion is in the equatorial plane (that is, $\theta_0 = \pi/2$ and $\dot{\theta}_0 = 0$). Describe the subsequent
motion. (d) Suppose instead that the initial motion is along a line of longitude (that is, $\dot{\phi}_0 = 0$). Describe
the subsequent motion.

7.40 *** The “spherical pendulum” is just a simple pendulum that is free to move in any sideways
direction. (By contrast a “simple pendulum,” unqualified, is confined to a single vertical plane.)
The bob of a spherical pendulum moves on a sphere, centered on the point of support with radius $r = R$,
the length of the pendulum. A convenient choice of coordinates is spherical polars, $r, \theta, \phi$, with the
origin at the point of support and the polar axis pointing straight down. The two variables $\theta$ and $\phi$ make
a good choice of generalized coordinates. (a) Find the Lagrangian and the two Lagrange equations.
(b) Explain what the $\phi$ equation tells us about the $z$ component of angular momentum $\ell_z$. (c) For the
special case that $\phi$ = const, describe what the $\theta$ equation tells us. (d) Use the $\phi$ equation to replace $\dot{\phi}$
by $\ell_z$ in the $\theta$ equation and discuss the existence of an angle $\theta_0$ at which $\theta$ can remain constant. Why
is this motion called a conical pendulum? (e) Show that if $\theta = \theta_0 + \epsilon$, with $\epsilon$ small, then $\theta$ oscillates
about $\theta_0$ in harmonic motion. Describe the motion of the pendulum’s bob.

7.41 *** Consider a bead of mass $m$ sliding without friction on a wire that is bent in the shape of a
parabola and is being spun with constant angular velocity $\omega$ about its vertical axis, as shown in Figure
7.17. Use cylindrical polar coordinates and let the equation of the parabola be $z = k\rho^2$. Write down the
Lagrangian in terms of $\rho$ as the generalized coordinate. Find the equation of motion of the bead and
determine whether there are positions of equilibrium, that is, values of $\rho$ at which the bead can remain
fixed, without sliding up or down the spinning wire. Discuss the stability of any equilibrium positions
you find.



7.42 *** [Computer] In Example 7.7 (page 264), we saw that the bead on a spinning hoop can make
small oscillations about its nonzero stable equilibrium points that are approximately sinusoidal, with
frequency $\Omega' = \sqrt{\omega^2 - (g/\omega R)^2}$ as in (7.80). Investigate how good this approximation is by solving
the equation of motion (7.73) numerically and then plotting both your numerical solution and the
approximate solution $\theta(t) = \theta_0 + A\cos(\Omega' t - \delta)$ on the same graph. Use the following numbers:
$g = R = 1$ and $\omega^2 = 2$, and initial conditions $\dot{\theta}(0) = 0$ and $\theta(0) = \theta_0 + \epsilon_0$, where $\epsilon_0 = 1°$. Repeat
with $\epsilon_0 = 10°$. Comment on your results.

7.43 *** [Computer] Consider a massless wheel of radius $R$ mounted on a frictionless horizontal axis.
A point mass $M$ is glued to the edge, and a massless string is wrapped several times around the perimeter
and hangs vertically down with a mass $m$ suspended from its bottom end. (See Figure 4.28.) Initially I
am holding the wheel with $M$ vertically below the axle. At $t = 0$, I release the wheel, and $m$ starts to fall
vertically down. (a) Write down the Lagrangian $\mathcal{L} = T - U$ as a function of the angle $\phi$ through which
the wheel has turned. Find the equation of motion and show that, provided $m < M$, there is one position
of stable equilibrium. (b) Assuming $m < M$, sketch the potential energy $U(\phi)$ for $-\pi < \phi < 4\pi$ and
use your graph to explain the equilibrium position you found. (c) Because the equation of motion
cannot be solved in terms of elementary functions, you are going to solve it numerically. This requires
that you choose numerical values for the various parameters. Take $M = g = R = 1$ (this amounts to
a convenient choice of units) and $m = 0.7$. Before solving the equation make a careful plot of $U(\phi)$
against $\phi$ and predict the kind of motion expected when $M$ is released from rest at $\phi = 0$. Now solve
the equation of motion for $0 < t < 20$ and verify your prediction. (d) Repeat part (c), but with $m = 0.8$.

7.44 *** [Computer] If you haven’t already done so, do Problem 7.29. One might expect that the
rotation of the wheel would have little effect on the pendulum, provided the wheel is small and rotates
slowly. (a) Verify this expectation by solving the equation of motion numerically, with the following
numbers: Take $g$ and $l$ to be 1. (This means that the natural frequency $\sqrt{g/l}$ of the pendulum is also
1.) Take $\omega = 0.1$, so that the wheel’s rotational frequency is small compared to the natural frequency
of the pendulum; and take the radius $R = 0.2$, significantly less than the length of the pendulum. As
initial conditions take $\phi = 0.2$ and $\dot{\phi} = 0$ at $t = 0$, and make a plot of your solution $\phi(t)$ for $0 < t < 20$.
Your graph should look very like the sinusoidal oscillations of an ordinary simple pendulum. Does the
period look correct? (b) Now plot $\phi(t)$ for $0 < t < 100$ and notice that the rotating support does make
a small difference, causing the amplitude of the oscillations to grow and shrink periodically. Comment
on the period of these small fluctuations.


section 7.8 More About Conservation Laws* 


7.45 ** (a) Verify that the coefficients $A_{ij}$ in the important expression (7.94) for the kinetic energy of
any “natural” system are symmetric; that is, $A_{ij} = A_{ji}$. (b) Prove that for any $n$ variables $v_1, \cdots, v_n$

$$\frac{\partial}{\partial v_i} \sum_{j,k} A_{jk} v_j v_k = 2 \sum_j A_{ij} v_j .$$

[Hint: Start with the case that $n = 2$, for which you can write out the sums in full. Notice that you
need the result of part (a).] This identity is useful in many areas of physics; we needed it to prove the
expression (7.96) for the generalized momentum $p_i$.

7.46 ** Noether’s theorem asserts a connection between invariance principles and conservation laws.
In Section 7.8 we saw that translational invariance of the Lagrangian implies conservation of total linear
momentum. Here you will prove that rotational invariance of $\mathcal{L}$ implies conservation of total angular
momentum. Suppose that the Lagrangian of an $N$-particle system is unchanged by rotations about a
certain symmetry axis. (a) Without loss of generality, take this axis to be the $z$ axis, and show that
the Lagrangian is unchanged when all of the particles are simultaneously moved from $(r_\alpha, \theta_\alpha, \phi_\alpha)$ to
$(r_\alpha, \theta_\alpha, \phi_\alpha + \epsilon)$ (same $\epsilon$ for all particles). Hence show that

$$\sum_{\alpha=1}^{N} \frac{\partial \mathcal{L}}{\partial \phi_\alpha} = 0 .$$

(b) Use Lagrange’s equations to show that this implies that the total angular momentum $L_z$ about the
symmetry axis is constant. In particular, if the Lagrangian is invariant under rotations about all axes,
then all components of $\mathbf{L}$ are conserved.

7.47 *** In Chapter 4 (at the end of Section 4.7) I claimed that, for a system with one degree of freedom,
positions of stable equilibrium “normally” correspond to minima of the potential energy $U(q)$. Using
Lagrangian mechanics, you can now prove this claim. (a) Consider a one-degree system of $N$ particles
with positions $\mathbf{r}_\alpha = \mathbf{r}_\alpha(q)$, where $q$ is the one generalized coordinate and the transformation between
$\mathbf{r}$ and $q$ does not depend on time; that is, $q$ is what we have now agreed to call “natural.” (This is the
meaning of the qualification “normally” in the statement of the claim. If the transformation depends
on time, then the claim is not necessarily true.) Prove that the KE has the form $T = \frac{1}{2} A \dot{q}^2$, where
$A = A(q) > 0$ may depend on $q$ but not on $\dot{q}$. [This corresponds exactly to the result (7.94) for $n$
degrees of freedom. If you have trouble with the proof here, review the proof there.] Show that the
Lagrange equation of motion has the form

$$A(q)\,\ddot{q} = -\frac{dU}{dq} - \frac{1}{2}\frac{dA}{dq}\,\dot{q}^2 .$$

(b) A point $q_0$ is an equilibrium point if, when the system is placed at $q_0$ with $\dot{q} = 0$, it remains there.
Show that $q_0$ is an equilibrium point if and only if $dU/dq = 0$. (c) Show that the equilibrium is stable
if and only if $U$ is minimum at $q_0$. (d) If you did Problem 7.30, show that the pendulum of that problem
does not satisfy the conditions of this problem and that the result proved here is false for that system.

section 7.9 Lagrange’s Equations for Magnetic Forces*

7.48 ** Let $F = F(q_1, \cdots, q_n)$ be any function of the generalized coordinates $(q_1, \cdots, q_n)$ of a system
with Lagrangian $\mathcal{L}(q_1, \cdots, q_n, \dot{q}_1, \cdots, \dot{q}_n, t)$. Prove that the two Lagrangians $\mathcal{L}$ and $\mathcal{L}' = \mathcal{L} + dF/dt$
give exactly the same equations of motion.

7.49 ** Consider a particle of mass $m$ and charge $q$ moving in a uniform constant magnetic field $\mathbf{B}$ in
the $z$ direction. (a) Prove that $\mathbf{B}$ can be written as $\mathbf{B} = \nabla \times \mathbf{A}$ with $\mathbf{A} = \frac{1}{2}\mathbf{B} \times \mathbf{r}$. Prove equivalently
that in cylindrical polar coordinates, $\mathbf{A} = \frac{1}{2} B \rho\, \hat{\boldsymbol{\phi}}$. (b) Write the Lagrangian (7.103) in cylindrical polar
coordinates and find the three corresponding Lagrange equations. (c) Describe in detail those solutions
of the Lagrange equations in which $\rho$ is a constant.

section 7.10 Lagrange Multipliers and Constraint Forces*

7.50 * A mass $m_1$ rests on a frictionless horizontal table. Attached to it is a string which runs
horizontally to the edge of the table, where it passes over a frictionless, small pulley and down to where
it supports a mass $m_2$. Use as coordinates $x$ and $y$ the distances of $m_1$ and $m_2$ from the pulley. These
satisfy the constraint equation $f(x, y) = x + y = \text{const}$. Write down the two modified Lagrange
equations and solve them (together with the constraint equation) for $\ddot{x}$, $\ddot{y}$, and the Lagrange multiplier $\lambda$.
Use (7.122) (and the corresponding equation in $y$) to find the tension forces on the two masses. Verify
your answers by solving the problem by the elementary Newtonian approach.

7.51 * Write down the Lagrangian for the simple pendulum of Figure 7.2 in terms of the rectangular
coordinates $x$ and $y$. These coordinates are constrained to satisfy the constraint equation
$f(x, y) = \sqrt{x^2 + y^2} = l$. (a) Write down the two modified Lagrange equations (7.118) and (7.119). Comparing
these with the two components of Newton’s second law, show that the Lagrange multiplier is (minus)
the tension in the rod. Verify Equation (7.122) and the corresponding equation in $y$. (b) The constraint
equation can be written in many different ways. For example we could have written
$f'(x, y) = x^2 + y^2 = l^2$. Check that using this function would have given the same physical results.

7.52 * The method of Lagrange multipliers works perfectly well with non-Cartesian coordinates.
Consider a mass $m$ that hangs from a string, the other end of which is wound several times around
a wheel (radius $R$, moment of inertia $I$) mounted on a frictionless horizontal axle. Use as coordinates
for the mass and the wheel $x$, the distance fallen by the mass, and $\phi$, the angle through which the
wheel has turned (both measured from some convenient reference position). Write down the modified
Lagrange equations for these two variables and solve them (together with the constraint equation) for
$\ddot{x}$ and $\ddot{\phi}$ and the Lagrange multiplier. Write down Newton’s second law for the mass and wheel, and
use them to check your answers for $\ddot{x}$ and $\ddot{\phi}$. Show that $\lambda\,\partial f/\partial x$ is indeed the tension force on the mass.
Comment on the quantity $\lambda\,\partial f/\partial \phi$.




CHAPTER 8


Two-Body Central-Force Problems

In this chapter, I shall discuss the motion of two bodies each of which exerts a 
conservative, central force on the other but which are subject to no other, “external,” 
forces. There are many examples of this problem: the two stars of a binary star system, 
a planet orbiting the sun, the moon orbiting the earth, the electron and proton in a 
hydrogen atom, the two atoms of a diatomic molecule. In most cases the true situation 
is more complicated. For example, even if we are interested in just one planet orbiting 
the sun, we cannot completely neglect the effects of all the other planets; likewise, 
the moon-earth system is subject to the external force of the sun. Nevertheless, in all 
cases, it is an excellent starting approximation to treat the two bodies of interest as 
being isolated from all outside influences. 

You may also object that the examples of the hydrogen atom and the diatomic 
molecule do not belong in classical mechanics, since all such atomic-scale systems 
must really be treated by quantum mechanics. However, many of the ideas I shall 
develop in this chapter (the important idea of reduced mass, for instance) play a crucial 
role in the quantum mechanical two-body problem, and it is probably fair to say that 
the material covered here is an essential prerequisite for the corresponding quantum 
material. 


8.1 The Problem 


Let us consider two objects, with masses $m_1$ and $m_2$. For the purposes of this chapter,
I shall assume the objects are small enough to be considered as point particles, whose
positions (relative to the origin $O$ of some inertial reference frame) I shall denote
by $\mathbf{r}_1$ and $\mathbf{r}_2$. The only forces are the forces $\mathbf{F}_{12}$ and $\mathbf{F}_{21}$ of their mutual interaction,
which I shall assume is conservative and central. Thus the forces can be derived from
a potential energy $U(\mathbf{r}_1, \mathbf{r}_2)$. In the case of two astronomical bodies (the earth and
the sun, for instance) the force is the gravitational force $G m_1 m_2/|\mathbf{r}_1 - \mathbf{r}_2|^2$, with the
corresponding potential energy (as we saw in Chapter 4)

$$U(\mathbf{r}_1, \mathbf{r}_2) = -\frac{G m_1 m_2}{|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{8.1}$$


For the electron and proton in a hydrogen atom, the potential energy is the Coulomb
PE of the two charges ($e$ for the proton and $-e$ for the electron),

$$U(\mathbf{r}_1, \mathbf{r}_2) = -\frac{k e^2}{|\mathbf{r}_1 - \mathbf{r}_2|} \tag{8.2}$$

where $k$ denotes the Coulomb force constant, $k = 1/4\pi\epsilon_0$.

In both of these examples, $U$ depends only on the difference $(\mathbf{r}_1 - \mathbf{r}_2)$, not on
$\mathbf{r}_1$ and $\mathbf{r}_2$ separately. As we saw in Section 4.9, this is no accident: Any isolated
system is translationally invariant, and if $U(\mathbf{r}_1, \mathbf{r}_2)$ is translationally invariant it can
only depend on $(\mathbf{r}_1 - \mathbf{r}_2)$. In the present case there is a further simplification: As we
saw in Section 4.8, if a conservative force is central, then $U$ is independent of the
direction of $(\mathbf{r}_1 - \mathbf{r}_2)$. That is, it only depends on the magnitude $|\mathbf{r}_1 - \mathbf{r}_2|$, and we can
write

$$U(\mathbf{r}_1, \mathbf{r}_2) = U(|\mathbf{r}_1 - \mathbf{r}_2|) \tag{8.3}$$

as is the case in the examples (8.1) and (8.2).

To take advantage of the property (8.3), it is convenient to introduce the new
variable

$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2. \tag{8.4}$$

As shown in Figure 8.1, this is just the position of body 1 relative to body 2, and
I shall refer to $\mathbf{r}$ as the relative position. The result of the previous paragraph can be
rephrased to say that the potential energy $U$ depends only on the magnitude $r$ of the
relative position $\mathbf{r}$,

$$U = U(r). \tag{8.5}$$


Figure 8.1 The relative position $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ is the position
of body 1 relative to body 2.

We can now state the mathematical problem that we have to solve: We want to
find the possible motions of two bodies (the moon and the earth, or an electron and a
proton), whose Lagrangian is

$$\mathcal{L} = \tfrac{1}{2} m_1 \dot{\mathbf{r}}_1^2 + \tfrac{1}{2} m_2 \dot{\mathbf{r}}_2^2 - U(r). \tag{8.6}$$

Of course, I could equally have stated the problem in Newtonian terms, and I shall 
in fact feel free to move back and forth between the Lagrangian and Newtonian 
formalisms according to which seems the more convenient. For the present, the 
Lagrangian formalism is the more transparent. 


8.2 CM and Relative Coordinates; Reduced Mass 


Our first task is to decide what generalized coordinates to use to solve our problem. 
There is already a strong suggestion that we should use the relative position r as one 
of them (or as three of them, depending on how you count coordinates), because the 
potential energy U (r) takes such a simple form in terms of r. The question is then, 
what to choose for the other (vector) variable. The best choice turns out to be the 
familiar center of mass (or CM) position, R, of the two bodies, defined as in Chapter 
3 to be 

$$\mathbf{R} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{m_1 + m_2} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{M} \tag{8.7}$$

where as before $M$ denotes the total mass of the two bodies:

$$M = m_1 + m_2 .$$

As we saw in Chapter 3, the CM of two particles lies on the line joining them, as
shown in Figure 8.2. The distances of the center of mass from the two masses $m_2$ and
$m_1$ are in the ratio $m_1/m_2$. In particular, if $m_2$ is much greater than $m_1$, then the CM
is very close to body 2. (In Figure 8.2, the ratio $m_1/m_2$ is about 1/3, so the CM is a
quarter of the way from $m_2$ to $m_1$.)



Figure 8.2 The center of mass of the two bodies lies at the
position $\mathbf{R} = (m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2)/M$ on the line joining the two
bodies.

We saw in Section 3.3 that the total momentum of the two bodies is the same as
if the total mass $M = m_1 + m_2$ were concentrated at the CM and were following the
CM as it moves:

$$\mathbf{P} = M\dot{\mathbf{R}}. \tag{8.8}$$

This result has important simplifying consequences: We know, of course, that the total
momentum is constant. Therefore, according to (8.8), $\dot{\mathbf{R}}$ is constant; and this means
we can choose an inertial reference frame in which the CM is at rest. This CM frame
is an especially convenient frame in which to analyze the motion, as we shall see.

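I am going to use the CM position $\mathbf{R}$ and the relative position $\mathbf{r}$ as generalized
coordinates for our discussion of the motion of our two bodies. In terms of these
coordinates, we already know that the potential energy takes the simple form $U = U(r)$.
To express the kinetic energy in these terms, we need to write the old variables
$\mathbf{r}_1$ and $\mathbf{r}_2$ in terms of the new $\mathbf{R}$ and $\mathbf{r}$. It is a straightforward exercise to show that (see
Figure 8.2)

$$\mathbf{r}_1 = \mathbf{R} + \frac{m_2}{M}\mathbf{r} \quad\text{and}\quad \mathbf{r}_2 = \mathbf{R} - \frac{m_1}{M}\mathbf{r}. \tag{8.9}$$

Thus the kinetic energy is

$$T = \tfrac{1}{2}\left(m_1 \dot{\mathbf{r}}_1^2 + m_2 \dot{\mathbf{r}}_2^2\right)
= \tfrac{1}{2}\left(m_1\left[\dot{\mathbf{R}} + \frac{m_2}{M}\dot{\mathbf{r}}\right]^2 + m_2\left[\dot{\mathbf{R}} - \frac{m_1}{M}\dot{\mathbf{r}}\right]^2\right)
= \tfrac{1}{2}\left(M\dot{\mathbf{R}}^2 + \frac{m_1 m_2}{M}\dot{\mathbf{r}}^2\right). \tag{8.10}$$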

The result (8.10) simplifies further if we introduce the parameter

$$\mu = \frac{m_1 m_2}{M} = \frac{m_1 m_2}{m_1 + m_2} \qquad \text{[reduced mass]} \tag{8.11}$$

which has the dimensions of mass and is called the reduced mass. You can easily
check that $\mu$ is always less than both $m_1$ and $m_2$ (hence the name). If $m_1 \ll m_2$, then $\mu$
is very close to $m_1$. Thus the reduced mass for the earth-sun system is almost exactly
the mass of the earth; the reduced mass of the electron and proton in hydrogen is
almost exactly the mass of the electron. On the other hand, if $m_1 = m_2$, then obviously
$\mu = \frac{1}{2} m_1$.
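
The limiting cases just described are easy to check numerically. The following is a minimal sketch, not part of the original text: the function name `reduced_mass` is my own, and the masses are rounded, approximate values chosen only for illustration.

```python
def reduced_mass(m1: float, m2: float) -> float:
    """Reduced mass mu = m1*m2/(m1 + m2), as defined in (8.11)."""
    return m1 * m2 / (m1 + m2)


# Approximate masses in kg (rounded illustrative values, not taken from the text).
M_EARTH, M_SUN = 5.97e24, 1.99e30
M_ELECTRON, M_PROTON = 9.109e-31, 1.673e-27

cases = [
    ("earth-sun", M_EARTH, M_SUN),
    ("electron-proton", M_ELECTRON, M_PROTON),
    ("equal masses", 1.0, 1.0),
]
for name, m1, m2 in cases:
    mu = reduced_mass(m1, m2)
    # mu/m1 is close to 1 when m1 << m2 and equals 1/2 when m1 = m2.
    print(f"{name}: mu/m1 = {mu / m1:.6f}")
```

In the first two cases the ratio $\mu/m_1$ comes out very close to 1, and in the equal-mass case it is exactly one half, in line with the statements above.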

Returning to (8.10), we can rewrite the kinetic energy in terms of $\mu$ as

$$T = \tfrac{1}{2} M \dot{\mathbf{R}}^2 + \tfrac{1}{2} \mu \dot{\mathbf{r}}^2. \tag{8.12}$$

This remarkable result shows that the kinetic energy is the same as that of two
“fictitious” particles, one of mass $M$ moving with the speed of the CM, and the other
of mass $\mu$ (the reduced mass) moving with the speed of the relative position $\mathbf{r}$. Even
more significant is the corresponding result for the Lagrangian:

$$\mathcal{L} = T - U = \tfrac{1}{2} M \dot{\mathbf{R}}^2 + \left(\tfrac{1}{2}\mu\dot{\mathbf{r}}^2 - U(r)\right)
= \mathcal{L}_{\text{cm}} + \mathcal{L}_{\text{rel}}. \tag{8.13}$$

We see that by using the CM and relative positions as our generalized coordinates, 
we have split the Lagrangian into two separate pieces, one of which involves only the 
CM coordinate R and the other only the relative coordinate r. This will mean that we 
can solve for the motions of R and r as two separate problems, which will greatly 
simplify matters. 
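
The splitting of the kinetic energy can be spot-checked with arbitrary velocities. The sketch below is an illustration under my own assumptions (the helper names `dot`, `kinetic_two_body`, and `kinetic_cm_rel` are hypothetical, and the masses and random velocities are arbitrary); it compares the two-body form of the kinetic energy with the CM-plus-relative form.

```python
import random


def dot(a, b):
    """Scalar product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def kinetic_two_body(m1, v1, m2, v2):
    """T = (1/2) m1 v1^2 + (1/2) m2 v2^2, the original two-body form."""
    return 0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)


def kinetic_cm_rel(m1, v1, m2, v2):
    """T = (1/2) M Rdot^2 + (1/2) mu rdot^2, the CM-plus-relative form."""
    M = m1 + m2
    mu = m1 * m2 / M
    R_dot = [(m1 * a + m2 * b) / M for a, b in zip(v1, v2)]  # CM velocity
    r_dot = [a - b for a, b in zip(v1, v2)]                  # relative velocity
    return 0.5 * M * dot(R_dot, R_dot) + 0.5 * mu * dot(r_dot, r_dot)


random.seed(0)  # arbitrary seed, for reproducibility only
m1, m2 = 1.0, 3.0
v1 = [random.uniform(-1.0, 1.0) for _ in range(3)]
v2 = [random.uniform(-1.0, 1.0) for _ in range(3)]

T_direct = kinetic_two_body(m1, v1, m2, v2)
T_split = kinetic_cm_rel(m1, v1, m2, v2)
print(T_direct, T_split)  # the two values should agree to rounding error
```

Because the potential energy depends only on $r$, agreement of the two kinetic energies is all that is needed for the Lagrangian itself to separate.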


8.3 The Equations of Motion 


With the Lagrangian (8.13), we can write down the equations of motion of our two-body
system. Because $\mathcal{L}$ is independent of $\mathbf{R}$, the $\mathbf{R}$ equation (really three equations,
one each for $X$, $Y$, and $Z$) is especially simple,

$$M\ddot{\mathbf{R}} = 0 \quad\text{or}\quad \dot{\mathbf{R}} = \text{const}. \tag{8.14}$$

We can explain this result in several ways: First (as we already knew), it is a direct
consequence of conservation of total momentum. Alternatively, we can view it as
reflecting that $\mathcal{L}$ is independent of $\mathbf{R}$, or, in the terminology introduced in Section
7.6, the CM coordinate $\mathbf{R}$ is “ignorable.” More specifically, $\mathcal{L}_{\text{cm}} = \frac{1}{2} M \dot{\mathbf{R}}^2$ (which is
the only part of $\mathcal{L}$ that involves $\mathbf{R}$) has the form of the Lagrangian of a free particle
of mass $M$ and position $\mathbf{R}$. Naturally, therefore (Newton’s first law), $\mathbf{R}$ moves with
constant velocity.

The Lagrange equation for the relative coordinate $\mathbf{r}$ is a little less simple but equally
beautiful: $\mathcal{L}_{\text{rel}}$, the only part of $\mathcal{L}$ that involves $\mathbf{r}$, is mathematically indistinguishable
from the Lagrangian for a single particle of mass $\mu$ and position $\mathbf{r}$, with potential
energy $U(r)$. Thus the Lagrange equation corresponding to $\mathbf{r}$ is just (check it and
see!)

$$\mu\ddot{\mathbf{r}} = -\nabla U(r). \tag{8.15}$$

To solve for the relative motion, we have only to solve Newton’s second law for a
single particle of mass equal to the reduced mass $\mu$ and position $\mathbf{r}$, with potential
energy $U(r)$.


The CM Reference Frame 

Our problem becomes even easier to think about if we make a clever choice of
reference frame. Specifically, because $\dot{\mathbf{R}}$ = const, we can choose an inertial reference
frame, the so-called CM frame, in which the CM is at rest and the total momentum
is zero. In this frame, $\dot{\mathbf{R}} = 0$ and the CM part of the Lagrangian is zero ($\mathcal{L}_{\text{cm}} = 0$).
Thus in the CM frame

$$\mathcal{L} = \mathcal{L}_{\text{rel}} = \tfrac{1}{2}\mu\dot{\mathbf{r}}^2 - U(r) \tag{8.16}$$

and the problem really is reduced to a one-body problem. This dramatic simplification
illustrates the curious terminology of the “ignorable coordinate.” Recall that a
coordinate $q_i$ is said to be ignorable if $\partial\mathcal{L}/\partial q_i = 0$. We see that, in the present case at
least, the motion associated with the ignorable coordinate $\mathbf{R}$ really is something that
we can ignore.

Figure 8.3 In the CM frame the center of mass is stationary at
the origin. The relative position $\mathbf{r}$ is the position of particle 1
relative to particle 2; therefore, the position of particle 1 relative
to the origin is $\mathbf{r}_1 = (m_2/M)\,\mathbf{r}$.

It is worth taking a moment to consider what the motion looks like in the CM
frame, as shown in Figure 8.3. The CM is stationary, and we naturally take it to be
the origin. Both particles are moving, but with equal and opposite momenta. If $m_2$ is
much greater than $m_1$ (as is often the case), the CM is close to $m_2$ and particle 2 has
a small velocity. (In the figure, $m_2 = 3m_1$ and hence $v_2 = \frac{1}{3} v_1$.) It is important to note
that the relative position $\mathbf{r}$ is the position of particle 1 relative to particle 2, and is not
the actual position of either particle. As shown in the picture, the position of particle 1
is actually $\mathbf{r}_1 = (m_2/M)\,\mathbf{r}$. However, if $m_2 \gg m_1$, then the CM is very close to particle
2, which is almost stationary, and $\mathbf{r}_1 \approx \mathbf{r}$; that is, $\mathbf{r}$ is very nearly the same thing as $\mathbf{r}_1$.

The equation of motion in the CM frame is derived from the Lagrangian $\mathcal{L}_{\text{rel}}$ of
(8.16) and is just Equation (8.15). This is precisely the same as the equation for a
single particle of mass equal to the reduced mass $\mu$, in the fixed central force field of
the potential energy $U(r)$. In the equations of this chapter, the repeated appearance
of the mass $\mu$ will serve to remind you that the equations apply to the relative motion
of two bodies. However, you may find it easier to visualize a single body (of mass
$\mu$) orbiting about a fixed force center. In particular, if $m_2 \gg m_1$, these two problems
are for practical purposes exactly the same. Moreover, if your interest actually is in
a single body, of mass $m$ say, orbiting a fixed force center, then you can use all of
the same equations, simply replacing $\mu$ with $m$. In any event, any solution for the
relative coordinate $\mathbf{r}(t)$ always gives us the motion of particle 1 relative to particle 2.
Equivalently, using the relations of Figure 8.3, knowledge of $\mathbf{r}(t)$ tells us the motion
of particle 1 (or particle 2) relative to the CM.


Conservation of Angular Momentum

We already know that the total angular momentum of our two particles is conserved. 
Like so many other things, this condition takes an especially simple form in the CM 
frame. In any frame, the total angular momentum is 

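$$\mathbf{L} = \mathbf{r}_1 \times \mathbf{p}_1 + \mathbf{r}_2 \times \mathbf{p}_2
= m_1 \mathbf{r}_1 \times \dot{\mathbf{r}}_1 + m_2 \mathbf{r}_2 \times \dot{\mathbf{r}}_2. \tag{8.17}$$

In the CM frame, we see from (8.9) (with $\mathbf{R} = 0$) that

$$\mathbf{r}_1 = \frac{m_2}{M}\mathbf{r} \quad\text{and}\quad \mathbf{r}_2 = -\frac{m_1}{M}\mathbf{r}. \tag{8.18}$$

Substituting into (8.17), we see that the angular momentum in the CM frame is

$$\mathbf{L} = \frac{m_1 m_2}{M^2}\left(m_2\,\mathbf{r} \times \dot{\mathbf{r}} + m_1\,\mathbf{r} \times \dot{\mathbf{r}}\right)
= \mathbf{r} \times \mu\dot{\mathbf{r}} \tag{8.19}$$

where I have replaced $m_1 m_2/M$ by the reduced mass $\mu$.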

The most remarkable thing about this result is that the total angular momentum in
the CM frame is exactly the same as the angular momentum of a single particle with
mass $\mu$ and position $\mathbf{r}$. For our present purposes the important point is that, because
angular momentum is conserved, we see that the vector $\mathbf{r} \times \dot{\mathbf{r}}$ is constant. In particular,
the direction of $\mathbf{r} \times \dot{\mathbf{r}}$ is constant, which implies that the two vectors $\mathbf{r}$ and $\dot{\mathbf{r}}$ remain
in a fixed plane. That is, in the CM frame, the whole motion remains in a fixed plane,
which we can take to be the $xy$ plane. In other words, in the CM frame, the two-body
problem with central conservative forces is reduced to a two-dimensional problem.
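
As a quick numerical illustration of this result, the sketch below builds the two particles' CM-frame positions and velocities from a chosen relative position and velocity, and compares the two-particle angular momentum with that of a single particle of mass $\mu$. The helper names and the sample numbers are my own assumptions, not from the text.

```python
def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(c, a):
    """Multiply the vector a by the scalar c."""
    return tuple(c * x for x in a)


def add(a, b):
    """Sum of two vectors."""
    return tuple(x + y for x, y in zip(a, b))


def angular_momentum_two_body(m1, m2, r, r_dot):
    """L = m1 r1 x r1dot + m2 r2 x r2dot, with r1 and r2 taken in the CM frame."""
    M = m1 + m2
    r1, r2 = scale(m2 / M, r), scale(-m1 / M, r)
    v1, v2 = scale(m2 / M, r_dot), scale(-m1 / M, r_dot)
    return add(scale(m1, cross(r1, v1)), scale(m2, cross(r2, v2)))


def angular_momentum_reduced(m1, m2, r, r_dot):
    """L = r x (mu rdot), the single-particle form."""
    mu = m1 * m2 / (m1 + m2)
    return scale(mu, cross(r, r_dot))


# Arbitrary sample values; r and r_dot both lie in the xy plane.
m1, m2 = 1.0, 3.0
r, r_dot = (1.0, 2.0, 0.0), (-0.5, 0.3, 0.0)

L_two = angular_momentum_two_body(m1, m2, r, r_dot)
L_one = angular_momentum_reduced(m1, m2, r, r_dot)
print(L_two, L_one)  # both point along z and should agree to rounding error
```

With $\mathbf{r}$ and $\dot{\mathbf{r}}$ chosen in the $xy$ plane, only the $z$ component is nonzero, which is the planar motion described above.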


The Two Equations of Motion 

To set up the equations of motion for the remaining two-dimensional problem, we
need to choose coordinates in the plane of the motion. The obvious choice is to use
the polar coordinates $r$ and $\phi$, in terms of which the Lagrangian (8.16) is

$$\mathcal{L} = \tfrac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\phi}^2\right) - U(r). \tag{8.20}$$

Since this Lagrangian is independent of $\phi$, the coordinate $\phi$ is ignorable, and the
Lagrange equation corresponding to $\phi$ is just

$$\frac{\partial\mathcal{L}}{\partial\dot{\phi}} = \mu r^2 \dot{\phi} = \text{const} = \ell \qquad [\phi \text{ equation}]. \tag{8.21}$$

Since $\mu r^2\dot{\phi}$ is the angular momentum $\ell$ (strictly, the $z$ component $\ell_z$), the $\phi$ equation
is just a statement of conservation of angular momentum.

The Lagrange equation corresponding to $r$ (often called the radial equation) is

$$\frac{\partial\mathcal{L}}{\partial r} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{r}},$$

or

$$\mu r \dot{\phi}^2 - \frac{dU}{dr} = \mu\ddot{r} \qquad [r \text{ equation}]. \tag{8.22}$$

As we already saw in Example 7.2 [Equations (7.19) and (7.20)], if we move the
centripetal term $\mu r\dot{\phi}^2$ over to the right, this is just the radial component of $\mathbf{F} = m\mathbf{a}$
(or rather, $\mathbf{F} = \mu\mathbf{a}$, since $\mu$ has replaced $m$).


8.4 The Equivalent One-Dimensional Problem 


The two equations of motion that we have to solve are the $\phi$ equation (8.21) and
radial equation (8.22). The constant $\ell$ (the angular momentum) in the $\phi$ equation is
determined by the initial conditions, and our main use for the $\phi$ equation is to solve
it for $\dot{\phi}$,

$$\dot{\phi} = \frac{\ell}{\mu r^2} \tag{8.23}$$

which will let us eliminate $\dot{\phi}$ from the radial equation in favor of the constant $\ell$. The
radial equation can be rewritten as

$$\mu\ddot{r} = -\frac{dU}{dr} + \mu r\dot{\phi}^2 = -\frac{dU}{dr} + F_{\text{cf}} \tag{8.24}$$

which has the form of Newton’s second law for a particle in one dimension with
mass $\mu$ and position $r$, subject to the actual force $-dU/dr$ plus a “fictitious” outward
centrifugal force¹

$$F_{\text{cf}} = \mu r\dot{\phi}^2. \tag{8.25}$$

In other words, the particle’s radial motion is exactly the same as if the particle were
moving in one dimension, subject to the actual force $-dU/dr$ plus the centrifugal
force $F_{\text{cf}}$.

We have now reduced the problem of the relative motion of two bodies to a
single one-dimensional problem, as expressed by (8.24). Before we discuss what the
solutions are going to look like, it is helpful to rewrite the centrifugal force, using the
$\phi$ equation (8.23) to eliminate $\dot{\phi}$ in favor of the constant $\ell$:

$$F_{\text{cf}} = \frac{\ell^2}{\mu r^3}. \tag{8.26}$$

Even better, we can now express the centrifugal force in terms of a centrifugal potential
energy,

$$F_{\text{cf}} = -\frac{d}{dr}\left(\frac{\ell^2}{2\mu r^2}\right) = -\frac{dU_{\text{cf}}}{dr}, \tag{8.27}$$

where the centrifugal potential energy $U_{\text{cf}}$ is defined as

$$U_{\text{cf}}(r) = \frac{\ell^2}{2\mu r^2}. \tag{8.28}$$

¹ This centrifugal force may be a little more familiar if I write it in terms of the azimuthal velocity
$v_\phi = r\dot{\phi}$ as $F_{\text{cf}} = \mu v_\phi^2/r$.


Returning to (8.24), we can now rewrite the radial equation in terms of $U_{\text{cf}}$ as

$$\mu\ddot{r} = -\frac{d}{dr}\left[U(r) + U_{\text{cf}}(r)\right] = -\frac{d}{dr}U_{\text{eff}}(r), \tag{8.29}$$

where the effective potential energy $U_{\text{eff}}(r)$ is the sum of the actual potential energy
$U(r)$ and the centrifugal $U_{\text{cf}}(r)$:

$$U_{\text{eff}}(r) = U(r) + U_{\text{cf}}(r) = U(r) + \frac{\ell^2}{2\mu r^2}. \tag{8.30}$$

According to (8.29), the radial motion of the particle is exactly the same as if the
particle were moving in one dimension with an effective potential energy
$U_{\text{eff}} = U + U_{\text{cf}}$.
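
The construction of the effective potential can be sketched in a few lines of code. In the sketch below, everything beyond the formulas (8.29) and (8.30) is an assumption made for illustration: the names `make_u_eff` and `radial_force` are hypothetical, the derivative is approximated by a central difference, and the sample potential $U(r) = -\gamma/r$ with $\gamma = \mu = \ell = 1$ is just one convenient choice.

```python
def make_u_eff(U, ell, mu):
    """Return U_eff(r) = U(r) + ell^2 / (2 mu r^2), following (8.30)."""
    def u_eff(r):
        return U(r) + ell**2 / (2.0 * mu * r**2)
    return u_eff


def radial_force(u_eff, r, h=1e-6):
    """Right-hand side of (8.29), -dU_eff/dr, estimated by a central difference."""
    return -(u_eff(r + h) - u_eff(r - h)) / (2.0 * h)


# Illustrative choice (an assumption): an attractive inverse-square force,
# U(r) = -gamma/r, in units where gamma = mu = ell = 1.
gamma, mu, ell = 1.0, 1.0, 1.0
u_eff = make_u_eff(lambda r: -gamma / r, ell, mu)

for r in (0.5, 1.0, 5.0):
    print(f"r = {r}: U_eff = {u_eff(r):+.4f}, mu*r_ddot = {radial_force(u_eff, r):+.4f}")
```

For this sample potential the effective radial force is outward at small $r$, inward at large $r$, and vanishes where $\gamma/r^2 = \ell^2/\mu r^3$, that is, at $r = \ell^2/\gamma\mu$ (here $r = 1$).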


example 8.1 Effective Potential Energy for a Comet 

Write down the actual and effective potential energies for a comet (or planet)
moving in the gravitational field of the sun. Sketch the three potential energies
involved and use the graph of $U_{\text{eff}}(r)$ to describe the motion of $r$. Since planetary
motion was first described mathematically by the German astronomer Johannes
Kepler, 1571-1630, this problem of the motion of a planet or comet around the
sun (or any two bodies interacting via an inverse-square force) is often called
the Kepler problem.

The actual gravitational potential energy of the comet is given by the well- 
known formula 

(/(r) = _Gm ! m 2 (8.31) 

r 

where G is the universal gravitational constant, and m x and m 2 are the masses 
of the comet and the sun. The centrifugal potential energy is given by (8.28), so 
the total effective potential energy is 


U efi (r) = - 


Gm { m 2 

r 



(8.32) 


The general behavior of this effective potential energy is easily seen (Figure
8.4). When $r$ is large, the centrifugal term $\ell^2/2\mu r^2$ is negligible compared to
the gravitational term $-Gm_1m_2/r$, and the effective PE, $U_{\text{eff}}(r)$, is negative and
sloping up as $r$ increases. According to (8.29), the acceleration of $r$ is down
this slope. [The roller coaster car accelerates down the track defined by $U_{\text{eff}}(r)$.]
Thus when a comet is far from the sun, $\ddot r$ is always inward.

Figure 8.4 The effective potential energy $U_{\text{eff}}(r)$ that governs the
radial motion of a comet is the sum of the actual gravitational potential
energy $U(r) = -Gm_1m_2/r$ and the centrifugal term $U_{\text{cf}} = \ell^2/2\mu r^2$.
For large $r$, the dominant effect is the attractive gravitational force; for
small $r$, it is the repulsive centrifugal force.

When $r$ is small, the centrifugal term $\ell^2/2\mu r^2$ dominates the gravitational
term $-Gm_1m_2/r$ (unless $\ell = 0$), and near $r = 0$, $U_{\text{eff}}(r)$ is positive and slopes
downward. Thus, as a comet gets closer to the sun, $\ddot r$ eventually becomes
outward, and the comet starts to move away from the sun again. The one
exception to this statement is when the angular momentum is exactly zero, $\ell = 0$,
in which case (8.23) implies that $\dot\phi = 0$; that is, the comet is moving exactly
radially, along a line of constant $\phi$, and must at some time hit the sun.
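
As a quick numerical check of this description, here is a minimal Python sketch that evaluates (8.32) and the corresponding radial force $-dU_{\text{eff}}/dr$ at a small, an intermediate, and a large radius. The function names and the choice of units $Gm_1m_2 = \mu = \ell = 1$ (with `gamma` standing for $Gm_1m_2$) are assumptions made purely for illustration.

```python
def u_eff(r, gamma=1.0, mu=1.0, ell=1.0):
    """Effective potential energy (8.32); gamma stands for G*m1*m2 (illustrative units)."""
    return -gamma / r + ell**2 / (2.0 * mu * r**2)


def radial_force(r, gamma=1.0, mu=1.0, ell=1.0):
    """-dU_eff/dr: the inward gravitational force plus the outward centrifugal force."""
    return -gamma / r**2 + ell**2 / (mu * r**3)


for r in (0.2, 1.0, 5.0):
    # Positive force means the radial acceleration is outward, negative means inward.
    print(f"r = {r:4.1f}   U_eff = {u_eff(r):8.3f}   -dU_eff/dr = {radial_force(r):8.3f}")
```

In these units the radial force changes sign at $r = \ell^2/(Gm_1m_2\,\mu) = 1$, the bottom of the well in Figure 8.4: it is outward inside that radius and inward outside it.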


Conservation of Energy 

To find the details of the orbit we must look more closely at the radial equation (8.29).
If we multiply both sides of that equation by $\dot r$, we find that

$$\frac{d}{dt}\left(\tfrac{1}{2}\mu\dot r^2\right) = -\frac{d}{dt}U_{\text{eff}}(r). \tag{8.33}$$

In other words,

$$\tfrac{1}{2}\mu\dot r^2 + U_{\text{eff}}(r) = \text{const}. \tag{8.34}$$

This result is, in fact, just conservation of energy: If we write out $U_{\text{eff}}$ as $U + \ell^2/2\mu r^2$
and replace $\ell$ by $\mu r^2\dot\phi$, we see that

$$\tfrac{1}{2}\mu\dot r^2 + U_{\text{eff}}(r) = \tfrac{1}{2}\mu\dot r^2 + \tfrac{1}{2}\mu r^2\dot\phi^2 + U(r) = E. \tag{8.35}$$

This completes the rewriting of the two-dimensional problem of the relative motion as
an equivalent one-dimensional problem involving just the radial motion. We see that
the total energy (which we knew all along is constant) can be thought of as the
one-dimensional kinetic energy of the radial motion, plus the effective one-dimensional
potential energy $U_{\text{eff}}$, since the latter includes the actual potential energy $U$ and the
kinetic energy $\tfrac{1}{2}\mu r^2\dot\phi^2$ of the angular motion. This means that all of our experience
with one-dimensional problems, both in terms of forces and in terms of energy, can
be immediately transferred to the two-body central-force problem.

Example 8.2 Energy Considerations for a Comet or Planet

Examine again the comet (or planet) of Example 8.1 and, by considering its
total energy $E$, find the equation that determines the maximum and minimum
distances of the comet from the sun, if $E > 0$ and, again, if $E < 0$.

In the energy equation (8.35) the term $\tfrac{1}{2}\mu\dot r^2$ on the left is always greater
than or equal to zero. Therefore, the comet’s motion is confined to those regions
where $E \ge U_{\text{eff}}$. To see what this implies, I have redrawn in Figure 8.5 the graph
of $U_{\text{eff}}$ from Figure 8.4. Let us consider first the case that the comet’s energy is
greater than zero. In the figure I have drawn a dashed horizontal line at height
$E$, labeled $E > 0$. A comet with this energy can move anywhere that this line
is above the curve of $U_{\text{eff}}(r)$, but nowhere that the line is below the curve. This
means simply that the comet cannot move anywhere inside the turning point
labeled $r_{\min}$, determined by the condition

$$U_{\text{eff}}(r_{\min}) = E. \tag{8.36}$$

If the comet is initially moving in, toward the sun, then it will continue to do so
until it reaches $r_{\min}$, where $\dot r = 0$ instantaneously. It then moves outward, and,
since there are no other points at which $\dot r$ can vanish, it eventually moves off to
infinity, and the orbit is unbounded.

Figure 8.5 Plot of the effective potential energy $U_{\text{eff}}(r)$ against $r$ for a
comet. For a given energy $E$, the comet can only go where $E \ge U_{\text{eff}}(r)$.
For $E > 0$ this means it cannot go inside the turning point at $r_{\min}$ where
$U_{\text{eff}} = E$. For $E < 0$ it is confined between the two turning points
labeled $r_{\min}$ and $r_{\max}$.

If instead $E < 0$, then the line drawn at height $E$ (labeled $E < 0$) meets the
curve of $U_{\text{eff}}(r)$ at the two turning points labeled $r_{\min}$ and $r_{\max}$, and a comet with
$E < 0$ is trapped between these two values of $r$. If it is moving away from the sun
($\dot r > 0$) it continues to do so until it reaches $r_{\max}$, where $\dot r$ vanishes and reverses
sign. The comet then moves inward until it reaches $r_{\min}$, where $\dot r$ reverses again.
Therefore, the comet oscillates in and out between $r_{\min}$ and $r_{\max}$. For obvious
reasons, this type of orbit is called a bounded orbit.$^2$

Finally, if $E$ is equal to the minimum value of $U_{\text{eff}}(r)$ (for a given value of
the angular momentum $\ell$), the two turning points $r_{\min}$ and $r_{\max}$ coalesce, and the
comet is trapped at a fixed radius and moves in a circular orbit.
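
For the gravitational case the turning-point condition (8.36), with $U_{\text{eff}}$ taken from (8.32), is a quadratic in $r$: $Er^2 + \gamma r - \ell^2/2\mu = 0$, where $\gamma$ is used here as shorthand for $Gm_1m_2$. The sketch below solves it for the three cases discussed in this example. The function name `turning_points`, the default units $\gamma = \mu = \ell = 1$, and the sample energies are illustrative assumptions, not part of the example.

```python
import math


def turning_points(E, gamma=1.0, mu=1.0, ell=1.0):
    """Solve U_eff(r) = E for the Kepler U_eff of (8.32); gamma stands for G*m1*m2.

    Returns (r_min, r_max); r_max is math.inf when the orbit is unbounded.
    """
    if E == 0.0:
        # The quadratic degenerates to gamma*r = ell**2 / (2*mu).
        return ell**2 / (2.0 * mu * gamma), math.inf
    disc = gamma**2 + 2.0 * E * ell**2 / mu
    if disc < 0.0:
        raise ValueError("E lies below the minimum of U_eff; no motion is possible")
    root = math.sqrt(disc)
    r_min = (-gamma + root) / (2.0 * E)
    # For E > 0 the second root is negative and unphysical, so the orbit is unbounded.
    r_max = math.inf if E > 0.0 else (-gamma - root) / (2.0 * E)
    return r_min, r_max


for E in (0.5, -0.375, -0.5):
    print(E, turning_points(E))
```

With these units the minimum of $U_{\text{eff}}$ is $-0.5$, so the last energy gives coincident turning points (the circular orbit), $E = -0.375$ gives a bounded orbit between $r = 2/3$ and $r = 2$, and $E = 0.5$ gives a single inner turning point.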


In this example, I considered just the case of an inverse-square force, but many
two-body problems have the same qualitative features. For example, the motion of the two
atoms in a diatomic molecule is governed by an effective potential that was sketched
in Figure 4.12 and looks very much like the gravitational curve of Figure 8.5. Thus all of
our qualitative conclusions apply to the diatomic molecule and many other two-body
problems.

In thinking about the radial motion of the two-body problem, you must not entirely
forget the angular motion. According to (8.23), $\dot\phi = \ell/\mu r^2$, and $\phi$ is always changing,
always in the same sense (continually increasing or continually decreasing). For
example, as a comet with positive energy approaches the sun, the angle $\phi$ changes, at
a rate that increases as $r$ gets smaller; as the comet moves away, $\phi$ continues to change
in the same direction, but at a rate that decreases as $r$ gets larger. Thus the actual orbit
of a positive-energy comet looks something like Figure 8.6. For the case of an
inverse-square force (like gravity), the orbit of Figure 8.6 is actually a hyperbola, as we shall
prove shortly, but the unbounded orbits (that is, orbits with $E > 0$) are qualitatively
similar for many different force laws.

For the bounded orbits ($E < 0$), we have seen that $r$ oscillates between the two
extreme values $r_{\min}$ and $r_{\max}$, while $\phi$ continually increases (or decreases, but let’s
suppose the comet is orbiting counter-clockwise, so that $\phi$ is increasing). In the case of
the inverse-square force, we shall see that the period of the radial oscillations happens
to equal the time for $\phi$ to make exactly one complete revolution. Therefore, the motion
repeats itself exactly once per revolution, as in Figure 8.7(a). (We shall also see that,
for any inverse-square force, the bounded orbits are actually ellipses.) For most other
force laws, the period of the radial motion is different from the time to make one
revolution, and in most cases the orbit is not even closed (that is, it never returns to
its initial conditions).$^3$ Figure 8.7(b) shows an orbit for which $r$ goes from $r_{\min}$ to $r_{\max}$
and back to $r_{\min}$ in the time that the angle $\phi$ advances by about 330°, and the orbit
certainly does not close on itself after one revolution.


2 If we consider just one comet in orbit around the sun, then energy conservation implies that a 
bounded orbit (E < 0) can never change into an unbounded orbit (E > 0), nor vice versa. In reality 
a comet can occasionally come close enough to another comet or planet to change E, and the orbit 
can then change from bounded to unbounded or the other way. 

3 Besides the inverse-square force, the only important exception is the isotropic harmonic
oscillator, for which the orbits are also ellipses, as discussed in Section 5.3.

Figure 8.6 Typical unbounded orbit for a positive-energy comet.
Initially $r$ decreases from infinity to $r_{\min}$ and then goes back out
to infinity. Meanwhile the angle $\phi$ is continually increasing.

Figure 8.7 (a) The bounded orbits for any inverse-square force
have the unusual property that $r$ goes from $r_{\min}$ to $r_{\max}$ and back
to $r_{\min}$ in exactly the time that $\phi$ goes from 0 to 360°. Therefore
the orbit repeats itself every revolution. (b) For most other force
laws, the period of oscillation of $r$ is different from the time in
which $\phi$ advances by 360°, and the orbit does not close on itself
after one revolution. In this example, $r$ completes one cycle from
$r_{\min}$ to $r_{\max}$ and back to $r_{\min}$ while $\phi$ advances by about 330°.


8.5 The Equation of the Orbit 


The radial equation (8.29) determines $r$ as a function of $t$, but for many purposes we
would like to know $r$ as a function of $\phi$. For example, the function $r = r(\phi)$ will tell us
the shape of the orbit more directly. Thus we would like to rewrite the radial equation
as a differential equation for $r$ in terms of $\phi$. There are two tricks for doing this, but
let me first write the radial equation in terms of forces:

$$\mu\ddot r = F(r) + \frac{\ell^2}{\mu r^3}, \tag{8.37}$$

where $F(r)$ is the actual central force, $F = -dU/dr$, and the second term is the
centrifugal force.

The first trick to rewriting this equation in terms of $\phi$ is to make the substitution

$$u = \frac{1}{r} \quad\text{or}\quad r = \frac{1}{u}, \tag{8.38}$$

and the second is to rewrite the differential operator $d/dt$ in terms of $d/d\phi$ using the
chain rule:

$$\frac{d}{dt} = \frac{d\phi}{dt}\frac{d}{d\phi} = \dot\phi\,\frac{d}{d\phi} = \frac{\ell}{\mu r^2}\frac{d}{d\phi} = \frac{\ell u^2}{\mu}\frac{d}{d\phi}. \tag{8.39}$$

(The third equality follows because $\ell = \mu r^2\dot\phi$, and the last results from the change of
variables $u = 1/r$.)

Using the identity (8.39) we can rewrite $\ddot r$ on the left of the radial equation. First

$$\dot r = \frac{d}{dt}(r) = \frac{\ell u^2}{\mu}\frac{d}{d\phi}\left(\frac{1}{u}\right) = -\frac{\ell}{\mu}\frac{du}{d\phi}$$

and hence

$$\ddot r = \frac{d}{dt}(\dot r) = \frac{\ell u^2}{\mu}\frac{d}{d\phi}\left(-\frac{\ell}{\mu}\frac{du}{d\phi}\right) = -\frac{\ell^2u^2}{\mu^2}\frac{d^2u}{d\phi^2}. \tag{8.40}$$

Substituting back into the radial equation (8.37) we find

$$-\frac{\ell^2u^2}{\mu}\frac{d^2u}{d\phi^2} = F + \frac{\ell^2u^3}{\mu}$$

or

$$u''(\phi) = -u(\phi) - \frac{\mu}{\ell^2u(\phi)^2}F. \tag{8.41}$$

For any given central force $F$, this transformed radial equation is a differential
equation for the new variable $u(\phi)$. If we can solve it, then we can immediately write
down $r$ as $r = 1/u$. In the next section, we shall solve it for the case of an
inverse-square force and show that the resulting orbits are conic sections, that is, ellipses,
parabolas, or hyperbolas. First, here is a simpler example.

Example 8.3 The Radial Equation for a Free Particle

Solve the transformed radial equation (8.41) for a free particle (that is, a particle
subject to no forces) and confirm that the resulting orbit is the expected straight
line.

This example is probably one of the hardest ways of showing that a free
particle moves along a straight line. Nevertheless, it is a nice check that the
transformed radial equation makes sense. In the absence of forces, (8.41) is just

$$u''(\phi) = -u(\phi),$$

whose general solution we know to be

$$u(\phi) = A\cos(\phi - \delta), \tag{8.42}$$

where $A$ and $\delta$ are arbitrary constants. Therefore (renaming the constant $A = 1/r_0$),

$$r(\phi) = \frac{1}{u(\phi)} = \frac{r_0}{\cos(\phi - \delta)}. \tag{8.43}$$

This unpromising-looking equation is in fact the equation of a straight line in
polar coordinates, as you can see from Figure 8.8. In that picture $Q$ is a fixed
point with polar coordinates $(r_0, \delta)$, and the line in question is the line through
$Q$ perpendicular to $OQ$. It is easy to see that the point $P$ with polar coordinates
$(r, \phi)$ lies on this line if and only if $r\cos(\phi - \delta) = r_0$. In other words, Equation
(8.43) is the equation of this straight line.
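
The claim that (8.43) describes a straight line can also be checked numerically. In Cartesian coordinates the line through $Q$ perpendicular to $OQ$ is $x\cos\delta + y\sin\delta = r_0$, so the sketch below generates points from (8.43) and evaluates the left side. The function name and the sample values $r_0 = 2$, $\delta = 0.3$ are assumptions chosen only for illustration.

```python
import math


def free_particle_point(phi, r0=2.0, delta=0.3):
    """Cartesian point (x, y) on the orbit r = r0 / cos(phi - delta) of (8.43)."""
    r = r0 / math.cos(phi - delta)
    return r * math.cos(phi), r * math.sin(phi)


r0, delta = 2.0, 0.3
for phi in (-0.8, 0.0, 0.3, 1.2):  # all within 90 degrees of the direction delta
    x, y = free_particle_point(phi, r0, delta)
    # Projection of (x, y) onto the direction OQ; it should equal r0 for every phi.
    projection = x * math.cos(delta) + y * math.sin(delta)
    print(f"phi = {phi:5.2f}   x = {x:7.3f}   y = {y:7.3f}   projection = {projection:.6f}")
```

Every point has the same projection $r_0$ onto the direction of $OQ$, which is exactly the statement that the points lie on one straight line at perpendicular distance $r_0$ from the origin.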



Figure 8.8 The fixed point $Q$ has polar coordinates $(r_0, \delta)$ relative
to the origin $O$. The point $P$ with polar coordinates $(r, \phi)$
lies on the line through $Q$ perpendicular to $OQ$ if and only if
$r\cos(\phi - \delta) = r_0$. That is, the equation of this line is (8.43).

In the next section, I shall use the same transformed radial equation (8.41) to solve
a much less trivial problem, finding the path of a comet or any other body held in orbit
by an inverse-square force.


8.6 The Kepler Orbits 


Let us now return to the Kepler problem, the problem of finding the possible orbits
of a comet or any other object subject to an inverse-square force. The two important
examples of this problem are the motion of comets or planets around the sun (or
earth satellites around the earth), in which case the force is the gravitational force
$-Gm_1m_2/r^2$, and the orbital motion of two opposite charges $q_1$ and $q_2$, in which case
the force is the Coulomb force $kq_1q_2/r^2$. To include both cases and to simplify the
equations, I shall write the force as (remember that $u = 1/r$)

$$F(r) = -\frac{\gamma}{r^2} = -\gamma u^2, \tag{8.44}$$

where $\gamma$ is the “force constant,” equal to $Gm_1m_2$ in the gravitational case.$^4$

4 The constant $\gamma$ is positive for the gravitational force and for the force between two opposite
charges. As discussed in Problem 8.31, for two charges of the same sign, $\gamma$ is negative. For now,
we’ll assume it is positive.

Thanks to our elaborate preparations, we can now solve the main problem very
easily. Inserting the force (8.44) into the transformed radial equation (8.41), we find
that $u(\phi)$ must satisfy

$$u''(\phi) = -u(\phi) + \gamma\mu/\ell^2. \tag{8.45}$$

Notice that it is a unique feature of the inverse-square force that the last term in this
equation is a constant, since only in this case does the $u^2$ of the force cancel the $1/u^2$
in (8.41). Because this last term is constant, we can solve (8.45) very easily: If we
substitute

$$w(\phi) = u(\phi) - \gamma\mu/\ell^2,$$

the equation becomes

$$w''(\phi) = -w(\phi),$$

which has the general solution

$$w(\phi) = A\cos(\phi - \delta), \tag{8.46}$$

where $A$ is a positive constant and $\delta$ is a constant that we can take to be zero by
a suitable choice of the direction $\phi = 0$. Thus the general solution for $u(\phi)$ can be
written as

$$u(\phi) = \frac{\gamma\mu}{\ell^2} + A\cos\phi = \frac{\gamma\mu}{\ell^2}\left(1 + \epsilon\cos\phi\right), \tag{8.47}$$

where $\epsilon$ is just a new name for the dimensionless positive constant $A\ell^2/\gamma\mu$. Since
$u = 1/r$, the constant $\gamma\mu/\ell^2$ on the right has the dimensions [1/length], and I shall
introduce the length

$$c = \frac{\ell^2}{\gamma\mu}, \tag{8.48}$$

in terms of which our solution becomes

$$\frac{1}{r(\phi)} = \frac{1}{c}\left(1 + \epsilon\cos\phi\right)$$

or

$$r(\phi) = \frac{c}{1 + \epsilon\cos\phi}. \tag{8.49}$$

This is our solution for $r$ as a function of $\phi$, in terms of the undetermined positive
constant $\epsilon$ and the length $c = \ell^2/\gamma\mu$ (which is $\ell^2/Gm_1m_2\mu$ in the gravitational
problem). I shall now explore its properties, first for the bounded orbits and then for
the unbounded.
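
One way to gain confidence in the solution (8.47) is to integrate the differential equation (8.45) numerically and compare. The sketch below does this with a hand-written fourth-order Runge-Kutta step, writing the constant term of (8.45) as $1/c$. The function name `integrate_orbit`, the step count, and the sample values $c = 1$, $\epsilon = 0.5$ are illustrative assumptions; the integration starts at $\phi = 0$ with $\delta = 0$, where $u = (1 + \epsilon)/c$ and $u' = 0$.

```python
import math


def integrate_orbit(c, eps, phi_end, steps=2000):
    """Integrate u'' = -u + 1/c from phi = 0 to phi_end with RK4 and return u(phi_end)."""
    h = phi_end / steps
    u, v = (1.0 + eps) / c, 0.0  # initial u and u' at phi = 0

    def deriv(u, v):
        # First-order system: u' = v, v' = -u + 1/c.
        return v, -u + 1.0 / c

    for _ in range(steps):
        k1u, k1v = deriv(u, v)
        k2u, k2v = deriv(u + 0.5 * h * k1u, v + 0.5 * h * k1v)
        k3u, k3v = deriv(u + 0.5 * h * k2u, v + 0.5 * h * k2v)
        k4u, k4v = deriv(u + h * k3u, v + h * k3v)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return u


c, eps, phi = 1.0, 0.5, 2.0
u_numeric = integrate_orbit(c, eps, phi)
u_exact = (1.0 + eps * math.cos(phi)) / c  # the closed form (8.47) written with c
print(u_numeric, u_exact, abs(u_numeric - u_exact))
```

The two values should agree to many decimal places, which is a useful sanity check on both the algebra and the sign conventions.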


The Bounded Orbits 

The behavior of the orbit $r(\phi)$ in (8.49) is controlled by the as-yet undetermined
positive constant $\epsilon$. A glance at (8.49) shows this behavior is very different according
as $\epsilon < 1$ or $\epsilon \ge 1$. If $\epsilon < 1$, the denominator of (8.49) never vanishes, and $r(\phi)$
remains bounded for all $\phi$. If $\epsilon \ge 1$ the denominator vanishes at some angle, and
$r(\phi)$ approaches infinity as $\phi$ approaches that angle. Evidently the value $\epsilon = 1$ is the
boundary between the bounded and unbounded orbits. I shall show shortly that this
boundary corresponds exactly to the boundary between $E < 0$ and $E \ge 0$ discussed
before. Meanwhile, let us start with the case that the constant $\epsilon$ is less than 1. With
$\epsilon < 1$, the denominator of $r(\phi)$ in (8.49) oscillates as shown in Figure 8.9 between
the values $1 \pm \epsilon$. Therefore, $r(\phi)$ oscillates between

$$r_{\min} = \frac{c}{1 + \epsilon} \quad\text{and}\quad r_{\max} = \frac{c}{1 - \epsilon}, \tag{8.50}$$

with $r = r_{\min}$ at the so-called perihelion when $\phi = 0$, and $r = r_{\max}$ at the aphelion
when $\phi = \pi$. Since $r(\phi)$ is obviously periodic in $\phi$ with period $2\pi$, it follows that
$r(2\pi) = r(0)$ and the orbit closes on itself after just one revolution. Thus the general
appearance of the orbit is as in Figure 8.10.

Figure 8.9 The denominator $1 + \epsilon\cos\phi$ in Equation (8.49) for
$r(\phi)$ oscillates between $1 + \epsilon$ and $1 - \epsilon$, and is periodic with
period $2\pi$.

While the orbit shown in Figure 8.10 certainly looks like an ellipse, I have not yet
proved that it really is. However, it is a reasonably easy exercise (see Problem 8.16)
to rewrite (8.49) in Cartesian coordinates and cast it in the form

$$\frac{(x + d)^2}{a^2} + \frac{y^2}{b^2} = 1, \tag{8.51}$$

where (as you can easily check)

$$a = \frac{c}{1 - \epsilon^2}, \qquad b = \frac{c}{\sqrt{1 - \epsilon^2}}, \qquad d = a\epsilon. \tag{8.52}$$

Equation (8.51) is the standard equation of an ellipse with semimajor and semiminor
axes $a$ and $b$, except that where we expect to see $x$ we have $x + d$. This difference
reflects that our origin, the sun, is not at the center of the ellipse, but at a distance $d$
from it, as shown in Figure 8.10.

We can now identify the constant $\epsilon$, which started life as an undetermined constant
of integration in (8.47). According to (8.52) the ratio of the minor to major axes is

$$\frac{b}{a} = \sqrt{1 - \epsilon^2}. \tag{8.53}$$

Although you almost certainly don’t remember it, this equation is the definition (or
one possible definition) of the eccentricity of the ellipse. That is, this equation tells us
that the constant $\epsilon$ is the eccentricity. Notice that if $\epsilon = 0$, then $b = a$ and the ellipse
is a circle; if $\epsilon \to 1$, then $b/a \to 0$ and the ellipse becomes very thin and elongated.

Figure 8.10 The bounded orbits of a comet or planet as given by Equation
(8.49) are ellipses. The sun is at the origin $O$, which is one focus of the
ellipse (not the center $C$). The distances $a$ and $b$ are called the semimajor and
semiminor axes. The parameter $c = \ell^2/\gamma\mu$ introduced in (8.48) is the value
of $r$ when $\phi = 90°$. The points where the comet is closest and farthest from
the sun are called the perihelion and aphelion.

Having identified the constant $\epsilon$ as the eccentricity, we can now identify the
position of the sun in relation to the ellipse. According to (8.52) the distance from
the center $C$ to the sun at $O$ is $d = a\epsilon$, and (though again you may not remember it)
$a\epsilon$ is the distance from the center to either focus of the ellipse. Thus the position of
the sun is actually one of the ellipse’s two focuses, and we have now proved Kepler’s
first law, that the planets (and comets whose orbits are bounded) follow orbits that
are ellipses with the sun at one focus.
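
The Cartesian form of the orbit can be spot-checked numerically. The sketch below takes the semi-axes to be $a = c/(1-\epsilon^2)$ and $b = c/\sqrt{1-\epsilon^2}$, with $d = a\epsilon$ (the values that follow from completing the square in the Cartesian version of (8.49)), and confirms that points generated from the polar form (8.49) satisfy (8.51). The function names and the sample values $c = 1$, $\epsilon = 0.6$ are assumptions for illustration only.

```python
import math


def ellipse_parameters(c, eps):
    """Semimajor axis a, semiminor axis b, and center offset d for an orbit with eps < 1."""
    a = c / (1.0 - eps**2)
    b = c / math.sqrt(1.0 - eps**2)
    d = a * eps
    return a, b, d


def orbit_point(phi, c, eps):
    """Cartesian point on the Kepler orbit r = c / (1 + eps*cos(phi)) of (8.49)."""
    r = c / (1.0 + eps * math.cos(phi))
    return r * math.cos(phi), r * math.sin(phi)


c, eps = 1.0, 0.6
a, b, d = ellipse_parameters(c, eps)
residuals = []
for k in range(12):
    x, y = orbit_point(2.0 * math.pi * k / 12, c, eps)
    # Left side of (8.51) minus 1; zero means the point lies on the ellipse.
    residuals.append((x + d) ** 2 / a**2 + y**2 / b**2 - 1.0)
max_residual = max(abs(v) for v in residuals)
print(f"a = {a:.4f}, b = {b:.4f}, d = {d:.4f}, max residual = {max_residual:.2e}")
```

The residual is zero to rounding error at every sampled angle, and $b/a = \sqrt{1 - \epsilon^2} = 0.8$ for this choice of $\epsilon$.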


Example 8.4 Halley’s Comet

Halley’s comet, named for the English astronomer Edmund Halley (1656-1742),
follows a very eccentric orbit with $\epsilon = 0.967$. At closest approach (the
perihelion) the comet is 0.59 AU from the sun, fairly close to the orbit of
Mercury. (The AU or astronomical unit is the mean distance of the earth from
the sun, about $1.5 \times 10^8$ km.) What is the comet’s greatest distance from the
sun, that is, the distance of the aphelion?

The given distance is $r_{\min} = 0.59$ AU, and, according to (8.50),
$r_{\max}/r_{\min} = (1 + \epsilon)/(1 - \epsilon)$. Therefore

$$r_{\max} = \frac{1 + \epsilon}{1 - \epsilon}\,r_{\min} = \frac{1.967}{0.033}\,r_{\min} = 60\,r_{\min} = 35\ \text{AU}.$$

This means that at its greatest distance Halley’s comet is outside the orbit of
Neptune.
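
The arithmetic of this example is easy to reproduce. The short sketch below simply evaluates the ratio from (8.50); the function name is a hypothetical helper, and the inputs are the values quoted in the example.

```python
def aphelion_from_perihelion(r_min, eps):
    """Aphelion distance from (8.50): r_max / r_min = (1 + eps) / (1 - eps)."""
    return r_min * (1.0 + eps) / (1.0 - eps)


r_max = aphelion_from_perihelion(0.59, 0.967)  # distances in AU
print(f"r_max = {r_max:.1f} AU")
```

Because $1 - \epsilon$ is so small here, the result is quite sensitive to the last digit of $\epsilon$, which is why two significant figures are all the example claims.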


The Orbital Period; Kepler’s Third Law 

We can now find the period of the elliptical orbits of the comets and planets. According
to Kepler’s second law (Section 3.4), the rate at which a line from the sun to a comet
or planet sweeps out area is

$$\frac{dA}{dt} = \frac{\ell}{2\mu}.$$

Since the total area of an ellipse is $A = \pi ab$, the period is

$$\tau = \frac{A}{dA/dt} = \frac{2\pi ab\mu}{\ell}.$$

If we square both sides and use (8.53) to replace $b^2$ by $a^2(1 - \epsilon^2)$, this becomes

$$\tau^2 = 4\pi^2\,\frac{a^4(1 - \epsilon^2)\mu^2}{\ell^2} = 4\pi^2\,\frac{a^3c\mu^2}{\ell^2},$$

where in the last equality I used (8.52) to replace $a(1 - \epsilon^2)$ by $c$. Since the length $c$
was defined in (8.48) as $\ell^2/\gamma\mu$, this implies that

$$\tau^2 = 4\pi^2\,\frac{a^3\mu}{\gamma}. \tag{8.54}$$

Finally, $\gamma$ is the constant in the inverse-square force law $F = -\gamma/r^2$, and, for the
gravitational force, $\gamma = Gm_1m_2 = G\mu M$ where $M$ is the total mass, $M = m_1 + m_2$.
(Notice the handy identity that $m_1m_2 = \mu M$.) In our case $m_2 = M_s$, the mass of the
sun, which is very much greater than $m_1$, the mass of the comet or planet. Thus, to an
excellent approximation, $M \approx M_s$, and

$$\gamma = Gm_1m_2 \approx G\mu M_s.$$

Therefore, the factor of $\mu$ in (8.54) cancels, and we find that

$$\tau^2 = \frac{4\pi^2}{GM_s}\,a^3. \tag{8.55}$$

This is Kepler’s third law: Because the mass of the comet (or planet) has canceled
out, the law says that for all bodies orbiting the sun, the square of the period is
proportional to the cube of the semimajor axis. (For circular orbits, we can replace
$a^3$ by $r^3$.) The law applies equally to the satellites of any large body. For example, all
satellites of the earth, including the moon, obey the same law [with $M_s$ replaced by
the earth’s mass $M_e$ in (8.55)], and the same applies to all the moons of Jupiter.

Example 8.5 Period of a Low-Orbit Earth Satellite

Use Kepler’s third law to estimate the period of a satellite in a circular orbit a
few tens of miles above the earth’s surface.

The period is given by (8.55) with $M_s$ replaced by $M_e$. Since the orbit is
circular, we can replace $a$ by $r$, and since the orbit is close to the earth’s surface,
$r \approx R_e$, the radius of the earth. Therefore

$$\tau = 2\pi\sqrt{\frac{R_e^3}{GM_e}}.$$

This simplifies if we recall that $GM_e/R_e^2 = g$, the acceleration of gravity on
the earth’s surface, and we find that

$$\tau = 2\pi\sqrt{\frac{R_e}{g}} = 2\pi\sqrt{\frac{6.38 \times 10^6\ \text{m}}{9.8\ \text{m/s}^2}} = 5070\ \text{s} \approx 85\ \text{min}, \tag{8.56}$$

in agreement with the well-known observation that low-orbit satellites circle the
earth in about one and a half hours.
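
The estimate (8.56) is a one-line calculation, and writing Kepler’s third law with $GM_e = gR_e^2$ lets the same function handle circular orbits at any radius. The sketch below is a minimal illustration; the function name `circular_period` and the constant names are assumptions, and the numerical values are the ones used in the example.

```python
import math

R_EARTH = 6.38e6   # radius of the earth in meters, as used in the example
G_SURFACE = 9.8    # acceleration of gravity at the surface in m/s^2


def circular_period(r, R=R_EARTH, g=G_SURFACE):
    """Period of a circular orbit of radius r from (8.55), with G*M_e written as g*R**2."""
    return 2.0 * math.pi * math.sqrt(r**3 / (g * R**2))


tau = circular_period(R_EARTH)  # orbit skimming the surface, r = R_e
print(f"tau = {tau:.0f} s = {tau / 60.0:.1f} min")
```

Passing a larger radius, for example `circular_period(R_EARTH + 4.0e5)` for an orbit 400 km up, lengthens the period according to the $r^{3/2}$ dependence of the third law.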
Relation between Energy and Eccentricity

Finally, we can relate the eccentricity $\epsilon$ of the orbit to the energy $E$ of the comet or
other orbiting body. The simplest way to do this is to remember that, at its distance
of closest approach $r_{\min}$, the comet’s energy is equal to the effective potential energy
$U_{\text{eff}}$ [Equation (8.36)],

$$E = U_{\text{eff}}(r_{\min}) = -\frac{\gamma}{r_{\min}} + \frac{\ell^2}{2\mu r_{\min}^2} = \frac{1}{2r_{\min}}\left(\frac{\ell^2}{\mu r_{\min}} - 2\gamma\right). \tag{8.57}$$

Now we know from (8.50) that $r_{\min} = c/(1 + \epsilon)$, and from its definition (8.48) that
$c = \ell^2/\gamma\mu$. Therefore

$$r_{\min} = \frac{\ell^2}{\gamma\mu(1 + \epsilon)}$$

and, substituting into (8.57),

$$E = \frac{\gamma\mu(1 + \epsilon)}{2\ell^2}\left[\gamma(1 + \epsilon) - 2\gamma\right] = \frac{\gamma^2\mu}{2\ell^2}\left(\epsilon^2 - 1\right). \tag{8.58}$$

The calculations leading to (8.58) are equally valid for bounded and unbounded
orbits, and they imply the following expected correlations: Negative energies ($E < 0$)
correspond to eccentricities $\epsilon < 1$, which in turn correspond to bounded orbits.
Positive energies ($E > 0$) correspond to eccentricities $\epsilon > 1$, which in turn correspond to
unbounded orbits. Equation (8.58) is a useful relation between the mechanical
properties $E$ and $\ell$ and the geometrical property $\epsilon$. It implies some interesting connections.
For example, for a given value of the angular momentum $\ell$, the orbit of lowest
possible energy is the circular orbit with $\epsilon = 0$ (a connection which has an important
counterpart in quantum mechanics).


8.7 The Unbounded Kepler Orbits 


In the previous section, we found the general Kepler orbit, as given by (8.49),

$$r(\phi) = \frac{c}{1 + \epsilon\cos\phi}, \tag{8.59}$$

and examined in detail the bounded orbits, that is, those for which $\epsilon < 1$ or, equivalently,
as we have seen, $E < 0$. In this section, I shall sketch the corresponding analysis of
the unbounded orbits, with $\epsilon \ge 1$ and $E \ge 0$.

The boundary between the bounded and unbounded orbits comes when $\epsilon = 1$ or
$E = 0$. With $\epsilon = 1$, the denominator of (8.59) vanishes when $\phi = \pm\pi$. Therefore,
$r(\phi) \to \infty$ as $\phi \to \pm\pi$. That is, if $\epsilon = 1$, the orbit is unbounded and goes off to
infinity as the comet approaches $\phi = \pm\pi$. Some elementary algebra, parallel to what
led to (8.51), shows that with $\epsilon = 1$ the Cartesian version of (8.59) is

$$y^2 = c^2 - 2cx, \tag{8.60}$$

which is the equation of a parabola. This orbit is shown (with the long dashes) in
Figure 8.11.

Figure 8.11 Four different Kepler orbits for a comet: a circle, an
ellipse, a parabola, and a hyperbola. For clarity, the four orbits
were chosen with the same values for $r_{\min}$ and with the closest
approaches all in the same direction.

If $\epsilon > 1$ (or $E > 0$), the denominator of (8.59) vanishes at a value $\phi_{\max}$ determined
by the condition

$$\epsilon\cos(\phi_{\max}) = -1.$$

Thus $r(\phi) \to \infty$ when $\phi \to \pm\phi_{\max}$ and the orbit is confined to the range of angles
$-\phi_{\max} < \phi < \phi_{\max}$. This gives the orbit the general appearance sketched in Figure
8.6. With $\epsilon > 1$ the Cartesian form of (8.59) is (Problem 8.30)

$$\frac{(x - \delta)^2}{\alpha^2} - \frac{y^2}{\beta^2} = 1, \tag{8.61}$$

where you can easily identify the constants $\alpha$, $\beta$, and $\delta$ (Problem 8.30). This is the
equation of a hyperbola, and we have proved that, as anticipated, the positive energy
Kepler orbits are hyperbolas. One such orbit is shown (with the smaller dashes) in
Figure 8.11.


Summary of Kepler Orbits 

Our results for the Kepler orbits can be summarized as follows: All of the possible
orbits are given by Equation (8.59),

$$r(\phi) = \frac{c}{1 + \epsilon\cos\phi}, \tag{8.62}$$

and are characterized by the two constants of integration$^5$ $\epsilon$ and $c$. The dimensionless
constant $\epsilon$ is related to the comet’s energy by (8.58),

$$E = \frac{\gamma^2\mu}{2\ell^2}\left(\epsilon^2 - 1\right). \tag{8.63}$$

It is, as we have seen, the eccentricity of the orbit that determines the orbit’s shape as
follows:

    eccentricity          energy      orbit
    $\epsilon = 0$        $E < 0$     circle
    $0 < \epsilon < 1$    $E < 0$     ellipse
    $\epsilon = 1$        $E = 0$     parabola
    $\epsilon > 1$        $E > 0$     hyperbola

You can see from (8.62) that the constant $c$ is a scale factor that determines the size of
the orbit. It has the dimensions of length and is the distance from sun to comet when
$\phi = \pi/2$. It is equal to $\ell^2/\gamma\mu$ or, since $\gamma$ is the force constant $Gm_1m_2$,

$$c = \frac{\ell^2}{Gm_1m_2\mu}, \tag{8.64}$$

where $m_1$ is the mass of the comet, $m_2$ that of the sun, and $\mu$ is the reduced mass
$\mu = m_1m_2/(m_1 + m_2)$, which is exceedingly close to $m_1$ since $m_2$ is so large.
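
The summary can be turned into a small classifier. Solving the energy relation (8.58) for the eccentricity gives $\epsilon = \sqrt{1 + 2E\ell^2/\gamma^2\mu}$, and the table then assigns the orbit type. The sketch below is only an illustration: the function names, the tolerance used to recognize the borderline cases, and the units $\gamma = \mu = \ell = 1$ are assumptions.

```python
import math


def eccentricity(E, ell, gamma=1.0, mu=1.0):
    """Eccentricity obtained by solving E = gamma**2 * mu * (eps**2 - 1) / (2 * ell**2)."""
    arg = 1.0 + 2.0 * E * ell**2 / (gamma**2 * mu)
    if arg < 0.0:
        raise ValueError("E is below the circular-orbit energy for this angular momentum")
    return math.sqrt(arg)


def orbit_type(eps, tol=1e-12):
    """Classify a Kepler orbit by its eccentricity, following the summary table."""
    if eps < tol:
        return "circle"
    if eps < 1.0 - tol:
        return "ellipse"
    if eps <= 1.0 + tol:
        return "parabola"
    return "hyperbola"


for E in (-0.5, -0.375, 0.0, 0.5):
    eps = eccentricity(E, ell=1.0)
    print(f"E = {E:6.3f}   eps = {eps:.4f}   {orbit_type(eps)}")
```

In these units the scale length is $c = 1$, so for $E = -0.375$ the eccentricity $\epsilon = 0.5$ gives turning points $c/(1 \pm \epsilon)$ at $r = 2/3$ and $r = 2$.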


8.8 Changes of Orbit 


In this final section, I shall discuss how a satellite can change from one orbit to another.
For example, a spacecraft wishing to visit Venus may want to transfer from a circular
orbit close to the earth and centered on the sun to an elliptical orbit that will carry it
to the orbit of Venus. Another example, and the one we shall discuss here, is an earth
satellite wishing to change from one orbit about the earth to another, perhaps from a
circular orbit to an elliptical orbit that will carry it to a higher altitude. The analysis of
earth orbits is the same as that of orbits around the sun, except that the mass $M_s$ of the
sun must be replaced by the mass $M_e$ of the earth, and the closest and furthest points
from earth are called the perigee and apogee (instead of perihelion and aphelion for
the sun). We shall confine attention to bounded, elliptical orbits, for which the most
general form is

$$r(\phi) = \frac{c}{1 + \epsilon\cos(\phi - \delta)}. \tag{8.65}$$

(As long as we were interested in just one orbit, we could choose our $x$ axis so that
the angle $\delta$ was zero. If we’re interested in two arbitrary orbits, we cannot get rid of
$\delta$ in this way, at least not for both.)

5 Since Newton’s second law is a second-order differential equation and the motion is in two
dimensions, there are actually four constants of integration in all. The third is the constant $\delta$ in
(8.46) which we chose to be zero, forcing the axis of the orbit to be the $x$ axis. The fourth is the
comet’s position on the orbit at time $t = 0$.

Let us suppose that our spacecraft is initially in an orbit of the form (8.65) with
energy $E_1$, angular momentum $\ell_1$, and orbital parameters $c_1$, $\epsilon_1$, and $\delta_1$. A common
way to change orbits is for the spacecraft to fire its rockets vigorously for a brief time.
To a good approximation we can treat this procedure as an impulse that occurs at a
unique angle $\phi_0$ and causes an instantaneous change of velocity by a known amount.
From the known change in velocity, we can calculate the new energy $E_2$ and angular
momentum $\ell_2$. From (8.48) we can calculate the new value of $c_2$, and from (8.58) the
new eccentricity $\epsilon_2$. Finally, because the new orbit must join onto the old one at the
angle $\phi_0$, that is, $r_1(\phi_0) = r_2(\phi_0)$, we can find $\delta_2$ from the equation

$$\frac{c_1}{1 + \epsilon_1\cos(\phi_0 - \delta_1)} = \frac{c_2}{1 + \epsilon_2\cos(\phi_0 - \delta_2)}. \tag{8.66}$$

This calculation, though straightforward in principle, is tedious, and not especially
illuminating, in practice. To simplify the calculations and to better reveal the important
features, I shall treat just one important special case.


A Tangential Thrust at Perigee 

Let us consider a satellite that transfers from one orbit to another by firing its rockets 
in the tangential direction, forward or backward, when it is at the perigee of its 
initial orbit. By choice of our x axis, we can arrange that this occurs in the direction 
0 = 0, so that 0 O = 0 and = 0. Moreover, because the rockets are fired in the 
tangential direction, the velocity just after firing is still in the same direction, which is 
perpendicular to the radius from earth to the satellite. Therefore, the position at which 
the rockets are fired is also the perigee for the final orbit, 6 and S 2 = 0 as well. Thus 
the equation (8.66) that assures the continuity of the orbit reduces to 


Ci _ c 2 
1 + 1 + ^2 


(8.67) 


6 Actually, it can be the perigee of the final orbit or the apogee, but we can treat both cases at 
once, as we shall see directly. 
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Let us denote by A. the ratio of the satellite’s speeds just before and just after the firing 
of the rockets, v 2 = A?;,. I shall call A the thrust factor; if A > 1, then the thrust was 
forward and the satellite sped up; if 0 < A < 1, then the thrust was backward and the 
satellite slowed down. (In principle, A could be negative, but this would represent a 
reversal of direction, an unlikely maneuver that I shan’t consider.) 

At perigee the angular momentum is just l — firv. The value of r does not 
change during the impulse, and I shall assume that the firing of the rockets changes 
the satellite’s mass by a negligible amount. Under these assumptions, the angular 
momentum changes by the same factor as the speed: 


i 2 = U !• ( 8 . 68 ) 

According to (8.48), the parameter c is proportional to l 2 . Therefore, the new value 
of c is 


c 2 = A 2 Cl . (8.69) 

Substituting into (8.67) and solving for e 2 , we find for the new eccentricity 

e 2 = A 2 *! + (A 2 - 1). (8.70) 

Equation (8.70) contains almost all the interesting information about the new orbit.
For example, if $\lambda > 1$ (a forward thrust), it is easy to see that the new orbit has
$\epsilon_2 > \epsilon_1$. Thus the new orbit has the same perigee as the old one, but has greater
eccentricity and so lies outside the old orbit, as shown by the outer dashed curve in
Figure 8.12(a). If we make $\lambda$ large enough, then the new eccentricity becomes greater
than 1; in this case the new orbit is actually a hyperbola, and our spacecraft escapes from
the earth.

If we choose the thrust factor $\lambda < 1$ (a backward thrust), then the new eccentricity
is less than the old, $\epsilon_2 < \epsilon_1$, and the new orbit lies inside the old, as shown by the
inner dashed curve in Figure 8.12(b). As we make $\lambda$ steadily smaller, eventually $\epsilon_2$
vanishes; that is, if we fire the rockets backward with just the right impulse, we can
move the satellite into a circular orbit. If we choose $\lambda$ still smaller, then $\epsilon_2$ becomes
negative. What does this signify? The parameter $\epsilon$ started out as a positive constant,
but the orbital equation $r = c/(1 + \epsilon \cos\phi)$ makes perfectly good sense with $\epsilon < 0$.



(a) Forward thrust (b) Backward thrust 


Figure 8.12 Changing orbits. The satellite’s original orbit is shown 
as the solid curve, and the rockets are fired when the satellite is at 
the perigee P. (a) A forward impulse moves the satellite to the larger 
dashed elliptical orbit, (b) A backward impulse moves the satellite to 
the smaller dashed elliptical orbit. 




The only difference is that the direction $\phi = 0$ is now the direction of maximum $r$ and
$\phi = \pi$ is that of minimum $r$; that is, the apogee and perigee have exchanged places.
By administering a large enough backward thrust at $P$ (the old orbit's perigee), we
have transferred the satellite to a smaller orbit for which $P$ is now the apogee.
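
As a numerical illustration of (8.69) and (8.70), the following Python sketch applies a
thrust factor at perigee and reports the new orbit. The function names `apply_thrust` and
`apsides`, and the sample starting orbit ($c_1 = 1$, $\epsilon_1 = 0.5$ in arbitrary units),
are illustrative assumptions and do not come from the text.

```python
import math

def apply_thrust(c1, eps1, lam):
    """Orbit parameters after a tangential impulse at perigee, using (8.69) and (8.70)."""
    c2 = lam**2 * c1
    eps2 = lam**2 * eps1 + (lam**2 - 1)
    return c2, eps2

def apsides(c, eps):
    """Return (r_min, r_max) for the bounded orbit r = c/(1 + eps*cos(phi)), |eps| < 1."""
    if abs(eps) >= 1:
        raise ValueError("orbit is unbounded")
    return c / (1 + abs(eps)), c / (1 - abs(eps))

c1, eps1 = 1.0, 0.5          # hypothetical starting orbit, arbitrary length units
r_perigee = c1 / (1 + eps1)  # radius of the firing point P

for lam in (1.1, 0.9, 0.7):  # one forward thrust and two backward thrusts
    c2, eps2 = apply_thrust(c1, eps1, lam)
    r_at_P = c2 / (1 + eps2)  # radius of the new orbit in the direction phi = 0
    print(lam, round(eps2, 4), apsides(c2, eps2), math.isclose(r_at_P, r_perigee))
```

In every case the new orbit still passes through $P$. For the smallest thrust factor in the
loop the eccentricity comes out negative, and the radius of $P$ then equals the larger of the
two apsidal distances, which is the exchange of apogee and perigee described above.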


example 8.6 Changing between Circular Orbits 

A satellite's crew in a circular orbit of radius $R_1$ wishes to transfer to a circular
orbit of radius $2R_1$. It does this using two successive boosts, as shown in Figure
8.13. First it boosts itself at point $P$ into an elliptical transfer orbit 2, just large
enough to take it out to the required radius. Second, on reaching the required
radius (at $P'$, the apogee of the transfer orbit) it boosts itself into the desired
circular orbit 3. By what factor must it increase its speed in each of these two
boosts? That is, what are the required thrust factors $\lambda$ and $\lambda'$? By what factor
does the satellite's speed increase as a result of the whole maneuver?

The initial circular orbit has $c_1 = R_1$ and eccentricity $\epsilon_1 = 0$. The final orbit
is to have radius $R_3 = 2R_1$. According to (8.69), the transfer orbit has $c_2 = \lambda^2 R_1$
and, according to (8.70), $\epsilon_2 = \lambda^2 - 1$, where $\lambda$ is the thrust factor of the first
boost, at $P$. By the time the satellite reaches the point $P'$, we want it to be at
radius $R_3$. Since $P'$ is the apogee of the transfer orbit, this requires that


$$R_3 = \frac{c_2}{1 - \epsilon_2} = \frac{\lambda^2 R_1}{1 - (\lambda^2 - 1)} = \frac{\lambda^2 R_1}{2 - \lambda^2}, \tag{8.71}$$


which is easily solved for $\lambda$ to give

$$\lambda = \sqrt{\frac{2R_3}{R_1 + R_3}} = \sqrt{\tfrac{4}{3}} \approx 1.15. \tag{8.72}$$

The satellite must boost its speed by about 15% to move into the required transfer
orbit.





Figure 8.13 Two successive boosts, at P and P', transfer a 
satellite from the smaller circular orbit 1 to a transfer orbit 2 
and thence to the final circular orbit 3.

The second transfer occurs at $P'$, the apogee of the transfer orbit. In Problem
8.33 you can show that the second thrust factor is

$$\lambda' = \sqrt{\frac{R_1 + R_3}{2R_1}} = \sqrt{\tfrac{3}{2}} \approx 1.22; \tag{8.73}$$

that is, we need to boost the speed by 22% to move from the transfer orbit to the
final circular orbit.

It would be tempting to think that the overall change in speed, moving from
the initial to the final orbit, was just the product $\lambda\lambda'$ of the two thrust factors,
but this overlooks that the satellite's speed also changes as it moves around
the transfer orbit. By conservation of angular momentum, it is easy to see that
the speeds at the two ends of the transfer orbit satisfy
$v_2(\text{apo}) R_3 = v_2(\text{per}) R_1$. Therefore, the overall gain in speed is given by

$$v_3 = \lambda' v_2(\text{apo}) = \lambda' \frac{R_1}{R_3} v_2(\text{per}) = \lambda' \frac{R_1}{R_3} \lambda v_1 = \sqrt{\frac{R_1 + R_3}{2R_1}} \cdot \frac{R_1}{R_3} \cdot \sqrt{\frac{2R_3}{R_1 + R_3}} \, v_1 = \sqrt{\frac{R_1}{R_3}} \, v_1. \tag{8.74}$$


In the present case, $R_3 = 2R_1$ and hence $v_3 = v_1/\sqrt{2}$. That is, the final speed is
actually less than the initial by a factor of $\sqrt{2}$. This result [and more generally
the result (8.74)] could have been anticipated. It is easy to show (Problem 8.32)
that for circular orbits $v \propto 1/\sqrt{R}$. Thus doubling the radius necessarily requires
that the speed be reduced by a factor of $\sqrt{2}$.
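
The numbers in this example are easy to check. The sketch below, a minimal illustration
rather than part of the original example, evaluates (8.72), the thrust factor quoted in
Problem 8.33, and the overall speed ratio (8.74) for any pair of circular orbits. The
function name `transfer_thrust_factors` is a hypothetical choice.

```python
import math

def transfer_thrust_factors(R1, R3):
    """Thrust factors for the two-boost transfer between circular orbits R1 -> R3.

    Returns (lam, lam_prime, v3_over_v1): the boost at P from (8.72), the boost
    at P' as stated in Problem 8.33, and the overall speed ratio from (8.74).
    """
    lam = math.sqrt(2 * R3 / (R1 + R3))
    lam_prime = math.sqrt((R1 + R3) / (2 * R1))
    v3_over_v1 = lam * lam_prime * R1 / R3
    return lam, lam_prime, v3_over_v1

lam, lam_prime, ratio = transfer_thrust_factors(1.0, 2.0)  # the case R3 = 2*R1
print(round(lam, 3), round(lam_prime, 3), round(ratio, 3))
print(math.isclose(ratio, math.sqrt(1.0 / 2.0)))  # agrees with v proportional to 1/sqrt(R)
```

Only the ratio $R_3/R_1$ matters, so the radii can be given in any units.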


Principal Definitions and Equations of Chapter 8 

The Relative Coordinate and Reduced Mass 

When rewritten in terms of the relative coordinate

$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2 \qquad \text{[Eq. (8.4)]}$$

and the CM coordinate $\mathbf{R}$, the two-body problem is reduced to the problem of two
independent particles, a free particle with mass $M = m_1 + m_2$ and position $\mathbf{R}$, and a
particle with mass equal to the reduced mass

$$\mu = \frac{m_1 m_2}{m_1 + m_2}, \qquad \text{[Eq. (8.11)]}$$

position $\mathbf{r}$, and potential energy $U(r)$.


The Equivalent One-Dimensional Problem

The motion of the relative coordinate, with given angular momentum $\ell$, is equivalent
to the motion of a particle in one (radial) dimension, with mass $\mu$, position $r$ (with
$0 < r < \infty$), and effective potential energy

$$U_{\text{eff}}(r) = U(r) + U_{\text{cf}}(r) = U(r) + \frac{\ell^2}{2\mu r^2}, \qquad \text{[Eq. (8.30)]}$$

where $U_{\text{cf}}$ is called the centrifugal potential energy.


The Transformed Radial Equation 

With the change of variables from $r$ to $u = 1/r$ and elimination of $t$ in favor of $\phi$, the
equation of the one-dimensional radial motion becomes

$$u''(\phi) = -u(\phi) - \frac{\mu}{\ell^2 u(\phi)^2} F. \qquad \text{[Eq. (8.41)]}$$
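
The transformed radial equation (8.41) can be integrated numerically for any central force.
The sketch below is one possible approach, using a hand-written fourth-order Runge-Kutta
step and testing it on an attractive inverse-square force, for which the exact orbit
$r = c/(1 + \epsilon\cos\phi)$ is known. The function names, the step count, and the unit
choice $\gamma = \mu = \ell = 1$ are assumptions made for the illustration.

```python
import math

def radial_rhs(u, force, mu, ell):
    """Right side of (8.41): u'' = -u - mu*F(r)/(ell^2 u^2), with r = 1/u."""
    return -u - mu * force(1.0 / u) / (ell**2 * u**2)

def integrate_orbit(force, mu, ell, u0, du0, phi_end, steps):
    """Integrate (8.41) from phi = 0 to phi_end with classical RK4; returns (u, u')."""
    h = phi_end / steps
    u, w = u0, du0  # w stands for du/dphi
    for _ in range(steps):
        k1u, k1w = w, radial_rhs(u, force, mu, ell)
        k2u, k2w = w + 0.5 * h * k1w, radial_rhs(u + 0.5 * h * k1u, force, mu, ell)
        k3u, k3w = w + 0.5 * h * k2w, radial_rhs(u + 0.5 * h * k2u, force, mu, ell)
        k4u, k4w = w + h * k3w, radial_rhs(u + h * k3u, force, mu, ell)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        w += h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return u, w

gamma, mu, ell = 1.0, 1.0, 1.0          # assumed units
c, eps = ell**2 / (gamma * mu), 0.5     # orbit parameters; eps is a sample value

def kepler_force(r):
    return -gamma / r**2                # attractive inverse-square force

# Start at perigee (phi = 0), where u = (1 + eps)/c and u' = 0, and go half way round.
u_end, _ = integrate_orbit(kepler_force, mu, ell, (1 + eps) / c, 0.0, math.pi, 2000)
print(1.0 / u_end, c / (1 + eps * math.cos(math.pi)))  # numerical and exact apogee radius
```

Swapping `kepler_force` for another function of $r$ gives the orbit for that force law, at
the cost of losing the exact solution to compare against.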


The Kepler Orbits 

For a planet or comet, the force is the attractive inverse-square force
$F = -G m_1 m_2/r^2 = -\gamma/r^2$, and the solution of (8.41) is

$$r(\phi) = \frac{c}{1 + \epsilon \cos\phi}, \qquad \text{[Eq. (8.49)]}$$

where $c = \ell^2/\gamma\mu$ and $\epsilon$ is related to the energy by

$$E = \frac{\gamma^2 \mu}{2\ell^2}(\epsilon^2 - 1). \qquad \text{[Eq. (8.58)]}$$

This Kepler orbit is an ellipse, parabola, or hyperbola, according as the eccentricity
$\epsilon$ is less than, equal to, or greater than 1.
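
Equation (8.58) can be inverted to give the eccentricity from the energy and angular
momentum, after which the classification is a simple comparison with 1. The following
sketch does this; the function names and the tolerance `tol` used to recognize the
borderline cases are assumptions of the illustration, and the circle is reported
separately as the special ellipse with $\epsilon = 0$.

```python
import math

def eccentricity(E, ell, gamma, mu):
    """Invert (8.58): eps = sqrt(1 + 2*E*ell^2 / (gamma^2 * mu))."""
    arg = 1.0 + 2.0 * E * ell**2 / (gamma**2 * mu)
    if arg < -1e-12:
        raise ValueError("energy lies below the circular-orbit minimum")
    return math.sqrt(max(arg, 0.0))  # clip tiny negative round-off to zero

def classify(eps, tol=1e-9):
    """Name the Kepler orbit with eccentricity eps."""
    if eps < tol:
        return "circle"
    if eps < 1 - tol:
        return "ellipse"
    if eps <= 1 + tol:
        return "parabola"
    return "hyperbola"

# Sample energies in units with gamma = mu = ell = 1 (assumed for illustration).
for E in (-0.5, -0.375, 0.0, 1.5):
    eps = eccentricity(E, 1.0, 1.0, 1.0)
    print(E, eps, classify(eps))
```

Negative energies give bounded orbits and non-negative energies give unbounded ones, in
line with the sign of $\epsilon^2 - 1$ in (8.58).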


Problems for Chapter 8

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult (***). 

section 8.2 CM and Relative Coordinates; Reduced Mass 

8.1 * Verify that the positions of two particles can be written in terms of the CM and relative positions
as $\mathbf{r}_1 = \mathbf{R} + m_2\mathbf{r}/M$ and $\mathbf{r}_2 = \mathbf{R} - m_1\mathbf{r}/M$. Hence confirm that the total KE of the two
particles can be expressed as $T = \frac{1}{2}M\dot{\mathbf{R}}^2 + \frac{1}{2}\mu\dot{\mathbf{r}}^2$, where $\mu$ denotes the reduced mass
$\mu = m_1 m_2/M$.

8.2 ** Although the main topic of this chapter is the motion of two particles subject to no external
forces, many of the ideas [for example, the splitting of the Lagrangian $\mathcal{L}$ into two independent pieces
$\mathcal{L} = \mathcal{L}_{\text{cm}} + \mathcal{L}_{\text{rel}}$ as in Equation (8.13)] extend easily to more general situations. To illustrate this,
consider the following: Two masses $m_1$ and $m_2$ move in a uniform gravitational field $\mathbf{g}$ and interact
via a potential energy $U(r)$. (a) Show that the Lagrangian can be decomposed as in (8.13). (b) Write
down Lagrange's equations for the three CM coordinates $X, Y, Z$ and describe the motion of the CM.
Write down the three Lagrange equations for the relative coordinates and show clearly that the motion
of $\mathbf{r}$ is the same as that of a single particle of mass equal to the reduced mass $\mu$, with position $\mathbf{r}$ and
potential energy $U(r)$.

8.3 ** Two particles of masses $m_1$ and $m_2$ are joined by a massless spring of natural length $L$ and force
constant $k$. Initially, $m_2$ is resting on a table and I am holding $m_1$ vertically above $m_2$ at a height $L$. At
time $t = 0$, I project $m_1$ vertically upward with initial velocity $v_0$. Find the positions of the two masses
at any subsequent time $t$ (before either mass returns to the table) and describe the motion. [Hints: See
Problem 8.2. Assume that $v_0$ is small enough that the two masses never collide.]

section 8.3 The Equations of Motion 

8.4 * Using the Lagrangian (8.13) write down the three Lagrange equations for the relative coordinates
$x, y, z$ and show clearly that the motion of the relative position $\mathbf{r}$ is the same as that of a single particle
with position $\mathbf{r}$, potential energy $U(r)$, and mass equal to the reduced mass $\mu$.

8.5 * The momentum $\mathbf{p}$ conjugate to the relative position $\mathbf{r}$ is defined with components
$p_x = \partial\mathcal{L}/\partial\dot{x}$ and so on. Prove that $\mathbf{p} = \mu\dot{\mathbf{r}}$. Prove also that in the CM frame, $\mathbf{p}$ is the same as
$\mathbf{p}_1$, the momentum of particle 1 (and also $-\mathbf{p}_2$).

8.6 * Show that in the CM frame, the angular momentum $\boldsymbol{\ell}_1$ of particle 1 is related to the total angular
momentum $\mathbf{L}$ by $\boldsymbol{\ell}_1 = (m_2/M)\mathbf{L}$ and likewise $\boldsymbol{\ell}_2 = (m_1/M)\mathbf{L}$. Since $\mathbf{L}$ is conserved, this shows that
the same is true of $\boldsymbol{\ell}_1$ and $\boldsymbol{\ell}_2$ separately in the CM frame.

8.7 ** (a) Using elementary Newtonian mechanics find the period of a mass $m_1$ in a circular orbit
of radius $r$ around a fixed mass $m_2$. (b) Using the separation into CM and relative motions, find the
corresponding period for the case that $m_2$ is not fixed and the masses circle each other a constant
distance $r$ apart. Discuss the limit of this result if $m_2 \to \infty$. (c) What would be the orbital period if
the earth were replaced by a star of mass equal to the solar mass, in a circular orbit, with the distance
between the sun and star equal to the present earth-sun distance? (The mass of the sun is more than
300,000 times that of the earth.)

8.8 ** Two masses $m_1$ and $m_2$ move in a plane and interact by a potential energy $U(r) = \frac{1}{2}kr^2$. Write
down their Lagrangian in terms of the CM and relative positions $\mathbf{R}$ and $\mathbf{r}$, and find the equations of
motion for the coordinates $X, Y$ and $x, y$. Describe the motion and find the frequency of the relative
motion.

8.9 ** Consider two particles of equal masses, $m_1 = m_2$, attached to each other by a light straight
spring (force constant $k$, natural length $L$) and free to slide over a frictionless horizontal table. (a)
Write down the Lagrangian in terms of the coordinates $\mathbf{r}_1$ and $\mathbf{r}_2$, and rewrite it in terms of the CM
and relative positions, $\mathbf{R}$ and $\mathbf{r}$, using polar coordinates $(r, \phi)$ for $\mathbf{r}$. (b) Write down and solve the
Lagrange equations for the CM coordinates $X, Y$. (c) Write down the Lagrange equations for $r$ and
$\phi$. Solve these for the two special cases that $r$ remains constant and that $\phi$ remains constant. Describe
the corresponding motions. In particular, show that the frequency of oscillations in the second case is
$\omega = \sqrt{2k/m_1}$.

8.10 ** Two particles of equal masses $m_1 = m_2$ move on a frictionless horizontal surface in the vicinity
of a fixed force center, with potential energies $U_1 = \frac{1}{2}kr_1^2$ and $U_2 = \frac{1}{2}kr_2^2$. In addition, they interact with
each other via a potential energy $U_{12} = \frac{1}{2}\alpha k r^2$, where $r$ is the distance between them and $\alpha$ and $k$ are
positive constants. (a) Find the Lagrangian in terms of the CM position $\mathbf{R}$ and the relative position
$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$. (b) Write down and solve the Lagrange equations for the CM and relative coordinates
$X, Y$ and $x, y$. Describe the motion.

8.11 ** Consider two particles interacting by a Hooke's law potential energy, $U = \frac{1}{2}kr^2$, where $\mathbf{r}$
is their relative position $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$, and subject to no external forces. Show that $\mathbf{r}(t)$ describes an
ellipse. Hence show that both particles move on similar ellipses around their common CM. [This
is surprisingly awkward. Perhaps the simplest procedure is to choose the $xy$ plane as the plane of
the orbit and then solve the equation of motion (8.15) for $x$ and $y$. Your solution will have the form
$x = A\cos\omega t + B\sin\omega t$, with a similar expression for $y$. If you solve these for $\sin\omega t$ and $\cos\omega t$ and
remember that $\sin^2 + \cos^2 = 1$, you can put the orbital equation in the form $ax^2 + 2bxy + cy^2 = k$
where $k$ is a positive constant. Now invoke the standard result that if $a$ and $c$ are positive and $ac > b^2$,
this equation defines an ellipse.]

section 8.4 The Equivalent One-Dimensional Problem 

8.12 ** (a) By examining the effective potential energy (8.32) find the radius at which a planet (or
comet) with angular momentum $\ell$ can orbit the sun in a circular orbit with fixed radius. [Look at
$dU_{\text{eff}}/dr$.] (b) Show that this circular orbit is stable, in the sense that a small radial nudge will cause
only small radial oscillations. [Look at $d^2U_{\text{eff}}/dr^2$.] Show that the period of these oscillations is equal
to the planet's orbital period.

8.13 Two particles whose reduced mass is $\mu$ interact via a potential energy $U = \frac{1}{2}kr^2$, where $r$ is
the distance between them. (a) Make a sketch showing $U(r)$, the centrifugal potential energy $U_{\text{cf}}(r)$,
and the effective potential energy $U_{\text{eff}}(r)$. (Treat the angular momentum $\ell$ as a known, fixed constant.)
(b) Find the "equilibrium" separation $r_0$, the distance at which the two particles can circle each other
with constant $r$. [Hint: This requires that $dU_{\text{eff}}/dr$ be zero.] (c) By making a Taylor expansion of $U_{\text{eff}}(r)$
about the equilibrium point $r_0$ and neglecting all terms in $(r - r_0)^3$ and higher, find the frequency of
small oscillations about the circular orbit if the particles are disturbed a little from the separation $r_0$.

8.14 *** Consider a particle of reduced mass $\mu$ orbiting in a central force with $U = kr^n$ where $kn > 0$.
(a) Explain what the condition $kn > 0$ tells us about the force. Sketch the effective potential energy
$U_{\text{eff}}$ for the cases that $n = 2$, $-1$, and $-3$. (b) Find the radius at which the particle (with given angular
momentum $\ell$) can orbit at a fixed radius. For what values of $n$ is this circular orbit stable? Do your
sketches confirm this conclusion? (c) For the stable case, show that the period of small oscillations
about the circular orbit is $\tau_{\text{osc}} = \tau_{\text{orb}}/\sqrt{n + 2}$. Argue that if $\sqrt{n + 2}$ is a rational number, these orbits
are closed. Sketch them for the cases that $n = 2$, $-1$, and $7$.

section 8.6 The Kepler Orbits 

8.15 * In deriving Kepler's third law (8.55) we made an approximation based on the fact that the
sun's mass $M_s$ is much greater than that of the planet $m$. Show that the law should actually read
$\tau^2 = [4\pi^2/G(M_s + m)]a^3$, and hence that the "constant" of proportionality is actually a little different
for different planets. Given that the mass of the heaviest planet (Jupiter) is about $2 \times 10^{27}$ kg, while
$M_s$ is about $2 \times 10^{30}$ kg (and some planets have masses several orders of magnitude less than Jupiter),
by what percent would you expect the "constant" in Kepler's third law to vary among the planets?

8.16 ** We have proved in (8.49) that any Kepler orbit can be written in the form
$r(\phi) = c/(1 + \epsilon\cos\phi)$, where $c > 0$ and $\epsilon > 0$. For the case that $0 < \epsilon < 1$, rewrite this equation in rectangular
coordinates $(x, y)$ and prove that the equation can be cast in the form (8.51), which is the equation
of an ellipse. Verify the values of the constants given in (8.52).

8.17 ** If you did Problem 4.41 you met the virial theorem for a circular orbit of a particle in a central
force with $U = kr^n$. Here is a more general form of the theorem that applies to any periodic orbit of
a particle. (a) Find the time derivative of the quantity $G = \mathbf{r} \cdot \mathbf{p}$ and, by integrating from time 0 to $t$,
show that

$$\frac{G(t) - G(0)}{t} = 2\langle T \rangle + \langle \mathbf{F} \cdot \mathbf{r} \rangle$$

where $\mathbf{F}$ is the net force on the particle and $\langle f \rangle$ denotes the average over time of any quantity $f$.

(b) Explain why, if the particle's orbit is periodic and if we make $t$ sufficiently large, we can make the
left-hand side of this equation as small as we please. That is, the left side approaches zero as $t \to \infty$.

(c) Use this result to prove that if $\mathbf{F}$ comes from the potential energy $U = kr^n$, then $\langle T \rangle = n\langle U \rangle/2$, if
now $\langle f \rangle$ denotes the time average over a very long time.

8.18 ** An earth satellite is observed at perigee to be 250 km above the earth's surface and traveling
at about 8500 m/s. Find the eccentricity of its orbit and its height above the earth at apogee. [Hint:
The earth's radius is $R_e \approx 6.4 \times 10^6$ m. You will also need to know $GM_e$, but you can find this if you
remember that $GM_e/R_e^2 = g$.]

8.19 ** The height of a satellite at perigee is 300 km above the earth's surface and it is 3000 km at
apogee. Find the orbit's eccentricity. If we take the orbit to define the $xy$ plane and the major axis in
the $x$ direction with the earth at the origin, what is the satellite's height when it crosses the $y$ axis? [See
the hint for Problem 8.18.]

8.20 ** Consider a comet which passes through its aphelion at a distance $r_{\max}$ from the sun. Imagine
that, keeping $r_{\max}$ fixed, we somehow make the angular momentum $\ell$ smaller and smaller, though
not actually zero; that is, we let $\ell \to 0$. Use equations (8.48) and (8.50) to show that in this limit
the eccentricity $\epsilon$ of the elliptical orbit approaches 1 and that the distance of closest approach $r_{\min}$
approaches zero. Describe the orbit with $r_{\max}$ fixed but $\ell$ very small. What is the semimajor axis $a$?

8.21 *** (a) If you haven't already done so, do Problem 8.20. (b) Use Kepler's third law (8.55) to find
the period of this orbit in terms of $r_{\max}$ (and $G$ and $M_s$). (c) Now consider the extreme case that the
comet is released from rest at a distance $r_{\max}$ from the sun. (In this case $\ell$ is actually zero.) Use the
technique described in connection with (4.58) to find how long the comet takes to reach the sun. (Take
the sun's radius to be zero.) (d) Assuming the comet can somehow pass freely through the sun, describe
its overall motion and find its period. (e) Compare your answers in parts (b) and (d).

8.22 *** A particle of mass $m$ moves with angular momentum $\ell$ about a fixed force center with
$F(r) = k/r^3$ where $k$ can be positive or negative. (a) Sketch the effective potential energy $U_{\text{eff}}$ for
various values of $k$ and describe the various possible kinds of orbit. (b) Write down and solve the
transformed radial equation (8.41), and use your solutions to confirm your predictions in part (a).

8.23 *** A particle of mass $m$ moves with angular momentum $\ell$ in the field of a fixed force center with

$$F(r) = -\frac{k}{r^2} + \frac{\lambda}{r^3}$$

where $k$ and $\lambda$ are positive. (a) Write down the transformed radial equation (8.41) and prove that the
orbit has the form

$$r(\phi) = \frac{c}{1 + \epsilon\cos(\beta\phi)}$$

where $c$, $\beta$, and $\epsilon$ are positive constants. (b) Find $c$ and $\beta$ in terms of the given parameters, and describe
the orbit for the case that $0 < \epsilon < 1$. (c) For what values of $\beta$ is the orbit closed? What happens to your
results as $\lambda \to 0$?

8.24 *** Consider the particle of Problem 8.23, but suppose that the constant $\lambda$ is negative. Write down
the transformed radial equation (8.41) and describe the orbits of low angular momentum (specifically,
$\ell^2 < -\lambda m$).

8.25 *** [Computer] Consider a particle with mass $m$ and angular momentum $\ell$ in the field of a central
force $F = -k/r^{5/2}$. To simplify your equations, choose units for which $m = \ell = k = 1$. (a) Find the
value $r_0$ of $r$ at which $U_{\text{eff}}$ is minimum and make a plot of $U_{\text{eff}}(r)$ for $0 < r \le 5r_0$. (Choose your
scale so that your plot shows the interesting part of the curve.) (b) Assuming now that the particle has
energy $E = -0.1$, find an accurate value of $r_{\min}$, the particle's distance of closest approach to the force
center. (This will require the use of a computer program to solve the relevant equation numerically.)
(c) Assuming that the particle is at $r = r_{\min}$ when $\phi = 0$, use a computer program (such as "NDSolve"
in Mathematica) to solve the transformed radial equation (8.41) and find the orbit in the form $r = r(\phi)$
for $0 \le \phi \le 7\pi$. Plot the orbit. Does it appear to be closed?

8.26 *** Show that the validity of Kepler's first two laws for any body orbiting the sun implies that
the force (assumed conservative) of the sun on any body is central and proportional to $1/r^2$.

8.27 *** At time $t_0$ a comet is observed at radius $r_0$ traveling with speed $v_0$ at an acute angle $\alpha$ to the
line from the comet to the sun. Put the sun at the origin $O$, with the comet on the $x$ axis (at $t_0$) and its
orbit in the $xy$ plane, and then show how you could calculate the parameters of the orbital equation
in the form $r = c/[1 + \epsilon\cos(\phi - \delta)]$. Do so for the case that $r_0 = 1.0 \times 10^{11}$ m, $v_0 = 45$ km/s, and
$\alpha = 50$ degrees. [The sun's mass is about $2.0 \times 10^{30}$ kg, and $G = 6.7 \times 10^{-11}$ N·m²/kg².]


section 8.7 The Unbounded Kepler Orbits 

8.28 * For a given earth satellite with given angular momentum $\ell$, show that the distance of closest
approach $r_{\min}$ on a parabolic orbit is half the radius of the circular orbit.

8.29 ** What would become of the earth's orbit (which you may consider to be a circle) if half of the
sun's mass were suddenly to disappear? Would the earth remain bound to the sun? [Hints: Consider
what happens to the earth's KE and PE at the moment of the great disappearance. The virial theorem
for the circular orbit (Problem 4.41) helps with this one.] Treat the sun (or what remains of it) as
fixed.

8.30 ** The general Kepler orbit is given in polar coordinates by (8.49). Rewrite this in Cartesian
coordinates for the cases that $\epsilon = 1$ and $\epsilon > 1$. Show that if $\epsilon = 1$, you get the parabola (8.60), and if
$\epsilon > 1$, the hyperbola (8.61). For the latter, identify the constants $\alpha$, $\beta$, and $\delta$ in terms of $c$ and $\epsilon$.

8.31 *** Consider the motion of two particles subject to a repulsive inverse-square force (for example,
two positive charges). Show that this system has no states with $E < 0$ (as measured in the CM frame),
and that in all states with $E > 0$, the relative motion follows a hyperbola. Sketch a typical orbit. [Hint:
You can follow closely the analysis of Sections 8.6 and 8.7 except that you must reverse the force;
probably the simplest way to do this is to change the sign of $\gamma$ in (8.44) and all subsequent equations
(so that $F(r) = +\gamma/r^2$) and then keep $\gamma$ itself positive. Assume $\ell \neq 0$.]


section 8.8 Changes of Orbit

8.32 * Prove that for circular orbits around a given gravitational force center (such as the sun) the speed
of the orbiting body is inversely proportional to the square root of the orbital radius.

8.33 ** Figure 8.13 shows a space vehicle boosting from a circular orbit 1 at $P$ to a transfer orbit 2
and then from the transfer orbit at $P'$ to the final circular orbit 3. Example 8.6 derived in detail
the thrust factor required for the boost at $P$. Show similarly that the thrust factor required at $P'$ is
$\lambda' = \sqrt{(R_1 + R_3)/2R_1}$. [Your argument should parallel closely that of Example 8.6, but you must
account for the fact that $P'$ is the apogee (not perigee) of the transfer orbit. For example, the plus
signs in (8.67) should be minus signs here.]

8.34 ** Suppose that we decide to send a spacecraft to Neptune, using the simple transfer described in
Example 8.6. The craft starts in a circular orbit close to the earth (radius 1 AU or astronomical
unit) and is to end up in a circular orbit near Neptune (radius about 30 AU). Use Kepler's third law to
show that the transfer will take about 31 years. (In practice we can do a lot better than this by arranging
that the craft gets a gravitational boost as it passes Jupiter.)

8.35 *** A spacecraft in a circular orbit wishes to transfer to another circular orbit of quarter the
radius by means of a tangential thrust to move into an elliptical orbit and a second tangential thrust at
the opposite end of the ellipse to move into the desired circular orbit. (The picture looks like Figure
8.13 but run backwards.) Find the thrust factors required and show that the speed in the final orbit is
twice the initial speed.




CHAPTER 9


Mechanics in Noninertial Frames 


We saw in Chapter 1 that Newton’s laws are valid only in the special class of inertial 
reference frames — frames that are neither accelerating nor rotating. The natural 
reaction to this realization is to resolve to treat all mechanics problems using only 
inertial frames, and this is in fact what we have done so far. Nevertheless, there are 
situations where it is very desirable to consider a noninertial frame. For example, if you 
are sitting in a car that is accelerating and you wish to describe the behavior of a coin 
that you are tossing in the air, it is very natural to want to describe it as seen by you, that
is, in the accelerating frame of the car in which you are traveling. Another important 
example is a reference frame fixed to the earth. To an excellent approximation this 
is an inertial frame. Nevertheless, the earth is rotating on its axis and accelerating in 
its orbit, and, for both these reasons, a reference frame fixed to the earth is not quite 
inertial. In the majority of problems, the noninertial character of a frame fixed to the 
earth is totally negligible, but there are situations, for instance the firing of a
long-range rocket, in which the earth’s rotation has important consequences. Naturally, we
would like to describe the motion relative to the earth-bound frame that we live in, 
but to do so, we must learn to do mechanics in noninertial frames. 

In the first two sections of this chapter, I shall describe the simple case of a reference 
frame that is accelerating but not rotating. In the remainder of the chapter, I shall 
discuss the case of a rotating reference frame. 


9.1 Acceleration without Rotation 


Let us consider an inertial frame $\mathcal{S}_0$ and a second frame $\mathcal{S}$ that is accelerating relative
to $\mathcal{S}_0$ with acceleration $\mathbf{A}$, which need not necessarily be constant. Notice that because
the noninertial frame is the one we're really interested in, it's the one that I've called
$\mathcal{S}$. Also I'm using capital letters for the acceleration (and velocity) of frame $\mathcal{S}$ relative
to frame $\mathcal{S}_0$. The inertial frame $\mathcal{S}_0$ could be a frame anchored to the ground. (We'll
ignore the tiny acceleration of any earth-bound frame for now.) The frame $\mathcal{S}$ could
be a frame fixed in a railroad car that is moving relative to $\mathcal{S}_0$, with velocity $\mathbf{V}$ and
acceleration $\mathbf{A} = \dot{\mathbf{V}}$. Suppose now that a passenger in the car is playing catch with a
tennis ball of mass $m$, and let us consider the motion of the ball, first as measured
relative to $\mathcal{S}_0$. Because $\mathcal{S}_0$ is inertial, we know that Newton's second law holds, so

$$m\ddot{\mathbf{r}}_0 = \mathbf{F}, \tag{9.1}$$

where $\mathbf{r}_0$ is the ball's position relative to $\mathcal{S}_0$ and $\mathbf{F}$ is the net force on the ball, the vector
sum of all forces on the ball (gravity, air resistance, the passenger's hand pushing it,
etc.).

Now consider the same ball's motion as measured relative to the accelerating frame
$\mathcal{S}$. The ball's position relative to $\mathcal{S}$ is $\mathbf{r}$ and, by the vector addition of velocities, its
velocity $\dot{\mathbf{r}}$ relative to $\mathcal{S}$ is related to $\dot{\mathbf{r}}_0$ by the velocity-addition formula:¹

$$\dot{\mathbf{r}}_0 = \dot{\mathbf{r}} + \mathbf{V}; \tag{9.2}$$

that is,

(ball's velocity relative to ground)

= (ball's velocity relative to car) + (car's velocity relative to ground).

Differentiating and rearranging, we find that

$$\ddot{\mathbf{r}} = \ddot{\mathbf{r}}_0 - \mathbf{A}. \tag{9.3}$$

If we multiply this equation by $m$ and use (9.1) to replace $m\ddot{\mathbf{r}}_0$ by $\mathbf{F}$, we find that

$$m\ddot{\mathbf{r}} = \mathbf{F} - m\mathbf{A}. \tag{9.4}$$

This equation has exactly the form of Newton's second law, except that in addition to
$\mathbf{F}$, the sum of all forces identified in the inertial frame, there is an extra term on the
right equal to $-m\mathbf{A}$. This means we can continue to use Newton's second law in the
noninertial frame $\mathcal{S}$ provided we agree that in the noninertial frame we must add an
extra force-like term, often called the inertial force:

$$\mathbf{F}_{\text{inertial}} = -m\mathbf{A}. \tag{9.5}$$

This inertial force experienced in noninertial frames is familiar in several everyday
situations: If you sit in an aircraft accelerating rapidly toward takeoff, then, from your
point of view, there is a force that pushes you back into your seat. If you are standing
in a bus that brakes suddenly ($\mathbf{A}$ backward) the inertial force $-m\mathbf{A}$ is forward and
can make you fall on your face if you aren't properly braced. As a car goes rapidly
around a sharp curve, the inertial force experienced by the passengers is the so-called
centrifugal force that pushes them outward. One can take the view that the inertial force
is a "fictitious" force, introduced merely to preserve the form of Newton's second law.
Nevertheless, for an observer in an accelerating frame, it is entirely real.

¹ An important discovery of relativity is, of course, that velocities do not combine exactly
according to the simple vector addition of (9.2); nevertheless, in the framework of classical
mechanics (9.2) is correct. In the same way, I shall take for granted that the times measured
in $\mathcal{S}$ and $\mathcal{S}_0$ are the same, which is correct in classical, though not relativistic, mechanics.
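
The content of (9.4) can be checked on the tennis ball in the railroad car. The sketch
below, offered only as an illustration, finds the ball's position relative to the car in two
ways: by solving (9.1) in the ground frame and subtracting the car's position, and by
solving (9.4) directly in the car frame with the inertial force included. It assumes that
gravity is the only real force, that the car starts from rest with constant acceleration, and
that the two frames coincide at $t = 0$; all numerical values and function names are
hypothetical.

```python
def position(t, r_init, v_init, accel):
    """Constant-acceleration motion, r(t) = r_init + v_init*t + accel*t^2/2, per component."""
    return tuple(r + v * t + 0.5 * a * t * t for r, v, a in zip(r_init, v_init, accel))

g = (0.0, -9.8)      # gravitational acceleration in m/s^2 (assumed value)
A = (2.0, 0.0)       # car's acceleration relative to the ground (assumed value)
r_init = (0.0, 1.0)  # ball's initial position, the same in both frames at t = 0
v_init = (0.0, 4.0)  # ball's initial velocity, the same in both frames since V(0) = 0

def relative_via_ground(t):
    """Solve (9.1) in the inertial frame, then subtract the car's position."""
    ball = position(t, r_init, v_init, g)
    car = position(t, (0.0, 0.0), (0.0, 0.0), A)
    return tuple(b - c for b, c in zip(ball, car))

def relative_direct(t):
    """Solve (9.4) in the car frame: acceleration (F - mA)/m = g - A."""
    g_minus_A = tuple(gi - ai for gi, ai in zip(g, A))
    return position(t, r_init, v_init, g_minus_A)

for t in (0.0, 0.25, 0.5):
    print(t, relative_via_ground(t), relative_direct(t))
```

The two columns agree at every time, and the horizontal component shows the ball drifting
toward the back of the car, as the inertial force $-m\mathbf{A}$ suggests.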

In many problems involving objects in accelerated frames, much the simplest 
procedure is to go ahead and use Newton’s second law in the noninertial frame, always 
remembering to include the extra inertial force as in (9.4). Here is a simple example: 


example 9.1 A Pendulum in an Accelerating Car 

Consider a simple pendulum (mass $m$ and length $L$) mounted inside a railroad
car that is accelerating to the right with constant acceleration $\mathbf{A}$ as shown in
Figure 9.1. Find the angle $\phi_{\text{eq}}$ at which the pendulum will remain at rest relative
to the accelerating car and find the frequency of small oscillations about this
equilibrium angle.

As observed in any inertial frame, there are just two forces on the bob, the
tension in the string $\mathbf{T}$ and the weight $m\mathbf{g}$; thus the net force (in any inertial
frame) is $\mathbf{F} = \mathbf{T} + m\mathbf{g}$. If we choose to work in the noninertial frame of the
accelerating car, there is also the inertial force $-m\mathbf{A}$, and the equation of motion
(9.4) is

$$m\ddot{\mathbf{r}} = \mathbf{T} + m\mathbf{g} - m\mathbf{A}. \tag{9.6}$$

A remarkable simplification in this problem is that, because the weight $m\mathbf{g}$ and
the inertial force $-m\mathbf{A}$ are both proportional to the mass $m$, we can combine
these two terms and write

$$m\ddot{\mathbf{r}} = \mathbf{T} + m(\mathbf{g} - \mathbf{A}) = \mathbf{T} + m\mathbf{g}_{\text{eff}} \tag{9.7}$$

where $\mathbf{g}_{\text{eff}} = \mathbf{g} - \mathbf{A}$. We see that the pendulum's equation of motion in the
accelerating frame of the car is exactly the same as in an inertial frame, except
that $\mathbf{g}$ has been replaced by an effective $\mathbf{g}_{\text{eff}} = \mathbf{g} - \mathbf{A}$, as shown in Figure 9.1.²
This makes the solution of our problem almost trivially simple.

Figure 9.1 A pendulum is suspended from the roof of a railroad
car that is accelerating with constant acceleration $\mathbf{A}$. In the noninertial
frame of the car, the acceleration manifests itself through
the inertial force $-m\mathbf{A}$, which, in turn, is equivalent to the replacement
of $\mathbf{g}$ by the effective $\mathbf{g}_{\text{eff}} = \mathbf{g} - \mathbf{A}$.

If the pendulum is to remain at rest (as seen in the car), then $\ddot{\mathbf{r}}$ must be zero,
and according to (9.7) $\mathbf{T}$ must be exactly opposite to $m\mathbf{g}_{\text{eff}}$. In particular, we see
from Figure 9.1 that the direction of $\mathbf{T}$ (and hence of the pendulum) has to be

$$\phi_{\text{eq}} = \arctan(A/g). \tag{9.8}$$

The small-amplitude frequency of a pendulum in an inertial frame is well known
to be $\omega = \sqrt{g/L}$. Thus the frequency of our pendulum is obtained by replacing $g$
by $g_{\text{eff}} = \sqrt{g^2 + A^2}$. That is,

$$\omega = \sqrt{\frac{g_{\text{eff}}}{L}} = \frac{(g^2 + A^2)^{1/4}}{\sqrt{L}}. \tag{9.9}$$

It is worth comparing this solution with other, more direct methods. First we 
could certainly have found the equilibrium angle (9.8) working in an inertial 
frame anchored to the ground. In this frame, there are only two forces on the 
bob (T and mg). If the pendulum is to remain at rest in the railroad car, then 
(as seen from the ground) it must accelerate at exactly A. Therefore the net 
force T + mg must equal mA, and by drawing a triangle of forces you can 
easily convince yourself that this requires T to have the direction (9.8). On the 
other hand, to find the frequency of oscillations (9.9) directly in the ground- 
based frame requires considerable ingenuity, and does not give the insight of 
our noninertial derivation. 

We could also have derived both results using the Lagrangian method. This
has the distinct advantage that you don't have to think about inertial and
noninertial frames; just write down the Lagrangian $\mathcal{L}$ in terms of the generalized
coordinate $\phi$ and then crank the handle. However, finding the frequency (9.9)
this way is quite clumsy, as you discovered if you did Problem 7.30.
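
For concrete numbers, the two results of this example can be evaluated in a few lines. The
sketch below is illustrative only: it computes the equilibrium angle (9.8) and the frequency
obtained by replacing $g$ with $g_{\text{eff}} = \sqrt{g^2 + A^2}$. The function name, the default
value $g = 9.8$ m/s², and the sample acceleration and length are assumptions.

```python
import math

def pendulum_in_car(A, L, g=9.8):
    """Equilibrium angle (rad) and small-oscillation angular frequency (rad/s).

    A is the magnitude of the car's horizontal acceleration, L the pendulum
    length, and g the magnitude of the gravitational acceleration (SI units).
    """
    phi_eq = math.atan2(A, g)      # same as arctan(A/g) for g > 0, as in (9.8)
    g_eff = math.hypot(g, A)       # magnitude of g - A, since A is horizontal
    omega = math.sqrt(g_eff / L)   # the inertial-frame result with g -> g_eff
    return phi_eq, omega

phi_eq, omega = pendulum_in_car(A=3.0, L=0.5)   # hypothetical car and pendulum
print(math.degrees(phi_eq), omega)
print(pendulum_in_car(A=0.0, L=0.5))            # reduces to 0 and sqrt(g/L) when A = 0
```

Because $g_{\text{eff}} \ge g$, the pendulum in the accelerating car always oscillates at least as
fast as the same pendulum in a car moving at constant velocity.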


9.2 The Tides 


A beautiful application of the result (9.4) is the explanation of the tides. As you 
probably know, the tides are the result of the bulges in the earth’s oceans caused by 
the gravitational attraction of the moon and sun. As the earth rotates, people on the 
earth’s surface move past these bulges and experience a rising and falling of the sea’s 
level. It turns out that the most important contributor to this effect is the moon, and to 
simplify our discussion I shall, at first, ignore the sun entirely. I shall also assume, at 
first, that the oceans cover the whole surface of the globe. 


2 This result, that the effect of being in an accelerated frame is the same as having an
additional gravitational force, is a cornerstone of general relativity, where it is called
the principle of equivalence.

(a) Wrong    (b) Right


Figure 9.2 Views of the earth and moon from high above the North Pole, with the 
earth rotating counterclockwise about the polar axis. (a) A plausible-sounding,
but incorrect, explanation of the tides argues that the attraction of the moon causes 
the oceans to bulge toward the moon. As the earth rotates once a day, a person 
fixed on the earth would experience one high tide per day, not the observed two. 
(b) The correct explanation: The main effect of the moon’s attraction is to give 
the whole earth (including the oceans) a small acceleration A toward the moon. 
As explained in the text, the residual effect is that the oceans bulge both toward 
and away from the moon as shown. As the earth rotates once a day, these two 
bulges cause the observed two high tides per day. 


A plausible-sounding, though entirely incorrect, explanation of the tides is illustrated
in Figure 9.2(a). According to this incorrect argument, the moon’s attraction
pulls the oceans toward the moon, producing a single bulge toward the moon. The
trouble with this argument (besides being wrong) is that a single bulge would cause
just one high tide per day, instead of the observed two.

The correct explanation, illustrated in Figure 9.2(b), is more complicated. The 
dominant effect of the moon is to give the whole earth, including the oceans, an 
acceleration A toward the moon. This acceleration is the centripetal acceleration of the 
earth as the moon and earth circle around their common center of mass and is (almost 
exactly) the same as if all the masses that make up the earth were concentrated at its 
center. This centripetal acceleration of any object on earth, as it orbits with the earth, 
corresponds to the pull of the moon that the object would feel at the earth’s center. 
Now any object on the moon side of the earth is pulled by the moon with a force that 
is slightly greater than it would be at the center. Therefore, as seen from the earth, 
objects on the side nearest the moon behave as if they felt a slight additional attraction 
toward the moon. In particular, the ocean surface bulges toward the moon. On the other 
hand, objects on the far side from the moon are pulled by the moon with a force that 
is slightly weaker than it would be at the center, which means that they move (relative 
to the earth) as if they were slightly repelled by the moon. This slight repulsion causes 
the ocean to bulge on the side away from the moon and is responsible for the second 
high tide of each day. 

Figure 9.3 A mass $m$ near the earth’s surface has position $\mathbf{r}$ relative to
the earth’s center, and $\mathbf{d}$ relative to the moon. The vector $\mathbf{d}_0$ is the position
of the earth’s center relative to the moon’s.

We can make this argument quantitative (and probably more convincing) if we
return to Equation (9.4). The forces on any mass $m$ near the earth’s surface are (1) the
gravitational pull, $m\mathbf{g}$, of the earth, (2) the gravitational pull, $-GM_m m\,\hat{\mathbf{d}}/d^2$, of the
moon (where $M_m$ is the mass of the moon and $\mathbf{d}$ is the position of the object relative to
the moon, as in Figure 9.3), and (3) the net non-gravitational force $\mathbf{F}_{\text{ng}}$ (for instance,
the buoyant force on a drop of sea water in the ocean). Meanwhile the acceleration of
our origin $O$, the earth’s center, is

$$\mathbf{A} = -GM_m \frac{\hat{\mathbf{d}}_0}{d_0^2},$$

where $\mathbf{d}_0$ is the position of the earth’s center relative to the moon. Putting all this
together we find for (9.4)

$$m\ddot{\mathbf{r}} = \mathbf{F} - m\mathbf{A} \tag{9.10}$$

$$= \left(m\mathbf{g} - GM_m m\,\frac{\hat{\mathbf{d}}}{d^2} + \mathbf{F}_{\text{ng}}\right) + GM_m m\,\frac{\hat{\mathbf{d}}_0}{d_0^2} \tag{9.11}$$

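or, if we combine the two terms that involve $M_m$,

$$m\ddot{\mathbf{r}} = m\mathbf{g} + \mathbf{F}_{\text{tid}} + \mathbf{F}_{\text{ng}}$$

where the tidal force

$$\mathbf{F}_{\text{tid}} = -GM_m m \left(\frac{\hat{\mathbf{d}}}{d^2} - \frac{\hat{\mathbf{d}}_0}{d_0^2}\right) \tag{9.12}$$

is the difference between the actual force of the moon on $m$ and the corresponding
force if $m$ were at the center of the earth.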

Figure 9.4 The tidal force $\mathbf{F}_{\text{tid}}$ as given by (9.12) is outward (away from
the earth’s center) at points $P$ and $R$, but inward (toward the earth’s
center) at $Q$ and $S$.

The entire effect of the moon on the motion (relative to the earth) of any object
near the earth is contained in the tidal force $\mathbf{F}_{\text{tid}}$ of (9.12). At the point directly facing
the moon, point $P$ in Figure 9.4, the vectors $\mathbf{d} = \overrightarrow{MP}$ and $\mathbf{d}_0 = \overrightarrow{MO}$ point in the same
direction, but $d < d_0$. Thus the first term in (9.12) dominates, and the tidal force is
toward the moon. At point $R$, facing directly away from the moon, again $\mathbf{d}$ (equal to
$\overrightarrow{MR}$) and $\mathbf{d}_0$ point in the same direction, but here $d > d_0$ and the tidal force points
away from the moon. At the point $Q$, the vectors $\mathbf{d} = \overrightarrow{MQ}$ and $\mathbf{d}_0 = \overrightarrow{MO}$ point in
different directions; the $x$ components of the two terms in (9.12) cancel almost exactly,
but only the first term has a $y$ component. Thus at $Q$ (and likewise at $S$) the tidal force
is inward, toward the earth’s center, as shown in Figure 9.4. In particular, the effect
of the tidal force is to distort the ocean into the shape shown in Figure 9.2(b), with
bulges centered on the points $P$ and $R$, giving the observed two high tides per day.
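The directions claimed for the four points can be checked numerically. The sketch below is a minimal illustration, not part of the original derivation: it evaluates (9.12) divided by $GM_m m$ in a coordinate system assumed here for convenience (earth's center at the origin, moon on the negative $x$ axis, so that $P$ has $x = -R_e$), using the earth radius and earth-moon distance quoted later in this section. The function names are hypothetical.

```python
import math

D0 = 3.84e8   # earth-moon distance in metres (value quoted in this section)
RE = 6.37e6   # earth radius in metres (value quoted in this section)

def tidal_force_per_gmm(x, y):
    """F_tid / (G M_m m) at the point (x, y), following Eq. (9.12).

    Assumed geometry: earth's center at the origin, moon at (-D0, 0).
    """
    dx, dy = x + D0, y                    # d = position relative to the moon
    d = math.hypot(dx, dy)
    # -(d_hat/d^2 - d0_hat/d0^2), with d_hat/d^2 = d/d^3 and d0_hat = (1, 0)
    return (-(dx / d**3 - 1.0 / D0**2), -(dy / d**3))

def radial_component(x, y):
    """Component of the tidal force along the outward radial direction."""
    fx, fy = tidal_force_per_gmm(x, y)
    r = math.hypot(x, y)
    return (fx * x + fy * y) / r

points = {"P": (-RE, 0.0), "Q": (0.0, RE), "R": (RE, 0.0), "S": (0.0, -RE)}
for name, (x, y) in points.items():
    fr = radial_component(x, y)
    print(name, "outward" if fr > 0 else "inward")
```

In this sketch the outward pull at $P$ and $R$ comes out roughly twice as strong as the inward squeeze at $Q$ and $S$, as one would expect from expanding (9.12) to first order in $R_e/d_0$.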


Magnitude of the Tides 

The simplest way to find the height difference between high and low tides is to
observe that the surface of the ocean is an equipotential surface, that is, a surface of
constant potential energy. To prove this, consider a drop of sea water on the surface of the ocean.
The drop is in equilibrium (relative to the earth’s reference frame) under the influence
of three forces: the earth’s gravitational pull $m\mathbf{g}$, the tidal force $\mathbf{F}_{\text{tid}}$, and the pressure
force $\mathbf{F}_p$ of the surrounding sea water. Since a static fluid cannot exert any shearing
force, the pressure force $\mathbf{F}_p$ must be normal to the surface of the ocean. (It is in fact just
the buoyant force of Archimedes’ principle.) Since the drop of water is in equilibrium,
it follows that $m\mathbf{g} + \mathbf{F}_{\text{tid}}$ must likewise be normal to the surface.

Now, both $m\mathbf{g}$ and $\mathbf{F}_{\text{tid}}$ are conservative, so each can be written as the gradient of
a potential energy:

$$m\mathbf{g} = -\nabla U_{\text{eg}} \quad\text{and}\quad \mathbf{F}_{\text{tid}} = -\nabla U_{\text{tid}}$$

where $U_{\text{eg}}$ is the potential energy due to the earth’s gravity and $U_{\text{tid}}$ is that of the tidal
force, which is, by inspection of (9.12), 3

$$U_{\text{tid}} = -GM_m m \left(\frac{1}{d} + \frac{x}{d_0^2}\right). \tag{9.13}$$


3 According to (9.12), $\mathbf{F}_{\text{tid}}$ is the sum of two terms. The first is just the usual inverse-square force
with corresponding potential energy $-GM_m m/d$, and the second is a constant vector pointing in the
$x$ direction, which gives the term $-GM_m m x/d_0^2$.

Figure 9.5 The difference $h$ between high and low tides is the difference between the lengths
$OP$ and $OQ$. This difference is much exaggerated here (since $h$ turns out to be of order 1 meter),
and both the lengths $OP$ and $OQ$ are very close to the radius of the earth, $R_e \approx 6400$ km. Also,
$r \approx R_e$ is actually about 60 times smaller than the earth-moon distance $OM = d_0$.


Thus, the statement that $m\mathbf{g} + \mathbf{F}_{\text{tid}}$ is normal to the surface of the ocean can be
rephrased to say that $\nabla(U_{\text{eg}} + U_{\text{tid}})$ is normal to the surface, which in turn implies
that $U = U_{\text{eg}} + U_{\text{tid}}$ is constant on the surface. In other words, the surface of the
ocean is an equipotential surface.

Since $U$ is constant on the surface, it follows that

$$U(P) = U(Q)$$

(see Figure 9.5) or

$$U_{\text{eg}}(P) - U_{\text{eg}}(Q) = U_{\text{tid}}(Q) - U_{\text{tid}}(P). \tag{9.14}$$

The left side here is just

$$U_{\text{eg}}(P) - U_{\text{eg}}(Q) = mgh \tag{9.15}$$

where $h$ is the required difference between high and low tides (the difference between
the lengths $OP$ and $OQ$ in Figure 9.5). To find the right side of (9.14) we must evaluate
the two tidal potential energies, $U_{\text{tid}}(Q)$ and $U_{\text{tid}}(P)$, from the definition (9.13). At
point $Q$, we see that $d = \sqrt{d_0^2 + r^2}$ (with $r \approx R_e$) and $x = 0$. Therefore, from (9.13),

$$U_{\text{tid}}(Q) = -\frac{GM_m m}{\sqrt{d_0^2 + R_e^2}}.$$

We can rewrite the square root as $\sqrt{d_0^2 + R_e^2} = d_0\sqrt{1 + (R_e/d_0)^2}$. Then, since
$R_e/d_0 \ll 1$, we can use the binomial approximation $(1 + \epsilon)^{-1/2} \approx 1 - \epsilon/2$ to give

$$U_{\text{tid}}(Q) \approx -\frac{GM_m m}{d_0}\left(1 - \frac{R_e^2}{2d_0^2}\right). \tag{9.16}$$

At the point $P$, we see that $d = d_0 - R_e$ and $x = -R_e$, and a similar calculation gives
(Problem 9.5)

$$U_{\text{tid}}(P) \approx -\frac{GM_m m}{d_0}\left(1 + \frac{R_e^2}{d_0^2}\right). \tag{9.17}$$

(As you can check, one gets the same answer at the point R facing away from the 
moon, so that, in this approximation, the heights of the two daily tides should be the 
same.) 
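To see how good the binomial approximation is, one can compare the exact difference $U_{\text{tid}}(Q) - U_{\text{tid}}(P)$ from (9.13) with the second-order estimate $\tfrac{3}{2}\,GM_m m\,R_e^2/d_0^3$. The sketch below is a minimal check under stated assumptions: it works in units where $GM_m m = 1$, uses the earth radius and earth-moon distance quoted in this section, and the helper name `u_tid` is hypothetical.

```python
import math

D0 = 3.84e8   # earth-moon distance in metres (value quoted in this section)
RE = 6.37e6   # earth radius in metres (value quoted in this section)

def u_tid(d, x):
    """Tidal potential energy of Eq. (9.13), in units where G*M_m*m = 1."""
    return -(1.0 / d + x / D0**2)

u_Q = u_tid(math.hypot(D0, RE), 0.0)   # point Q: d = sqrt(d0^2 + Re^2), x = 0
u_P = u_tid(D0 - RE, -RE)              # point P: d = d0 - Re, x = -Re

exact = u_Q - u_P
approx = 1.5 * RE**2 / D0**3           # result of the binomial approximation
rel_err = abs(exact - approx) / approx
print(f"relative error of the approximation: {rel_err:.3%}")
```

The discrepancy is of order $R_e/d_0$, that is, around a percent, which is well within the accuracy of the idealized model.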

Substituting (9.15), (9.16), and (9.17) into (9.14) we get

$$mgh = \frac{GM_m m}{d_0}\,\frac{3R_e^2}{2d_0^2}.$$

If we recall that $g = GM_e/R_e^2$, this implies that

$$h = \frac{3M_m R_e^4}{2M_e d_0^3}. \tag{9.18}$$


Putting in the numbers ($M_m = 7.35 \times 10^{22}$ kg, $M_e = 5.98 \times 10^{24}$ kg, $R_e = 6.37 \times
10^6$ m, and $d_0 = 3.84 \times 10^8$ m), we find for the height of the tides, due to the moon
alone,

$$h = 54 \text{ cm} \quad [\text{moon alone}]. \tag{9.19}$$

The height of the tides caused by the sun alone is also given by (9.18), but with
$M_m$ replaced by the mass of the sun, $M_s = 1.99 \times 10^{30}$ kg, and $d_0$ replaced by the
distance from the sun to the earth, $1.495 \times 10^{11}$ m. This gives

$$h = 25 \text{ cm} \quad [\text{sun alone}]. \tag{9.20}$$
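The arithmetic behind (9.19) and (9.20) is easy to reproduce. The following minimal Python sketch evaluates (9.18) with the numbers quoted above; the function name `tide_height` is a hypothetical label introduced only for this illustration.

```python
def tide_height(M_body, d0, M_e=5.98e24, R_e=6.37e6):
    """Height difference between high and low tide from Eq. (9.18), in metres.

    M_body: mass of the tide-raising body (kg); d0: its distance from the earth (m).
    M_e and R_e default to the earth's mass and radius quoted in the text.
    """
    return 1.5 * (M_body / M_e) * R_e**4 / d0**3

h_moon = tide_height(M_body=7.35e22, d0=3.84e8)     # moon alone
h_sun = tide_height(M_body=1.99e30, d0=1.495e11)    # sun alone
print(f"moon: {100 * h_moon:.0f} cm, sun: {100 * h_sun:.0f} cm")
```

Although the sun is vastly more massive than the moon, the factor $1/d_0^3$ more than compensates, which is why the solar tide comes out a little under half the lunar one.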

Although the sun’s contribution to the tides is less than the moon’s, it is certainly
not negligible, and the two effects combine in an interesting way. First, consider a time
when the earth, sun, and moon are approximately in line, either with the earth in the
middle, as at full moon, or the moon in the middle, as at new moon. (See Figure 9.6.)
In this case the tidal forces due to the moon and sun reinforce each other (since the
two bulges caused by the moon coincide with the two caused by the sun); thus, we
would predict large tides (known as spring tides) with $h$ given by the sum of (9.19)
and (9.20), that is, $h = 54 + 25 = 79$ cm. On the other hand, if the sun, earth, and
moon form a right angle, then the two tidal effects tend to cancel and we would predict
smaller tides (known as neap tides) with height $h = 54 - 25 = 29$ cm.

Although the theory just presented is basically correct, especially in the middle of 
the larger oceans, the real situation involves many intriguing complications. Perhaps 
the most important complication is the effect of the continental land masses. So far I 
have pretended that the oceans cover the whole world, allowing the tidal forces of the 
moon and sun to collect the two bulges of Figure 9.5. But the presence of continents 
can affect our conclusions, leading sometimes to smaller and sometimes to larger tides. 
A small sea, such as the Black Sea or even the Mediterranean, that is shut off from the
main oceans by land will obviously exhibit much smaller tides than we have calculated
here. On the other hand, the tides moving across a large ocean can be blocked by the
bordering continents and can build up to much greater heights. And tides entering
narrow tapering estuaries can cause quite dramatic “tidal bores.”

Figure 9.6 Four successive positions of the moon in its monthly orbit around the
earth. At new moon and full moon, the tidal effects of the moon and sun reinforce
each other and the tides are large (“spring” tides). At half moon, the two effects
tend to cancel and the tides are small (“neap” tides).


9.3 The Angular Velocity Vector 


In the remainder of this chapter, we shall be discussing the motion of objects as seen 
in reference frames that are rotating (relative to inertial frames). Before we begin this 
discussion, I must introduce some concepts and notation for handling rotations. A 
detailed study of rotations is actually surprisingly complicated. Fortunately, we do 
not need many of the details, and some of the properties that are quite hard to prove 
are reasonably plausible and can be stated without proof. 

The rotating axes with which we shall be concerned are almost always axes fixed 
in a rigid body. The most important example is a set of axes fixed in the rotating earth, 
but we shall see several other examples in Chapter 10. In discussing the rotation of a 
rigid body, there are really just two situations that concern us: Sometimes the body is 
rotating about a point of the body that is fixed (in some inertial frame); for example, 
a wheel that is spinning about a fixed axle, or a pendulum swinging about a fixed 
pivot. If the rotating body has no fixed point (for example, a baseball that is spinning 
as it flies through the air), then we usually proceed in two steps: First, we find the 
motion of the center of mass, and then we analyze the rotational motion of the body 
relative to its CM. As soon as we restrict attention to the motion relative to the CM, 
we are in effect examining the motion in a reference frame in which the CM is fixed. 
Thus either way, our discussion of a rotating body concerns a body with one point 
effectively fixed. 

The crucial result concerning a body rotating about a fixed point is called Euler’s
theorem and states that the most general motion of any body relative to a fixed point
$O$ is a rotation about some axis through $O$. Although this theorem is quite complicated
to prove, the result seems very natural and I hope you can accept it without proof. 4 It
implies that to specify a rotation about a given point $O$, we need give only the direction
of the axis about which the rotation occurred and the angle through which the body
rotated. Here, our concern is more with the rate of rotation, or angular velocity, and
Euler’s theorem implies that this can be specified by giving the direction of the axis of
rotation and the rate of rotation about this axis. The direction of the axis of rotation can
be specified by a unit vector $\mathbf{u}$ and the rate by the number $\omega = d\theta/dt$. For example,
a merry-go-round could be rotating about a vertical axis ($\mathbf{u}$ points vertically) at a rate
of 10 rad/min ($\omega = 10$ rad/min).

It is often convenient to combine the unit vector $\mathbf{u}$ with $\omega$ to form an angular
velocity vector

$$\boldsymbol{\omega} = \omega\mathbf{u}. \tag{9.21}$$

The single vector $\boldsymbol{\omega}$ specifies both the direction of the axis of rotation (namely, $\mathbf{u}$, the
direction of $\boldsymbol{\omega}$) and the rate of rotation (namely, $\omega$, the magnitude of $\boldsymbol{\omega}$). Actually we
have not yet quite defined a unique vector $\boldsymbol{\omega}$. For example, for the merry-go-round that
is rotating about a vertical axis, the vector $\boldsymbol{\omega}$ points vertically, but does it point up or
down? We remove this ambiguity using the right-hand rule: We choose the direction
of $\boldsymbol{\omega}$ so that when our right thumb points along $\boldsymbol{\omega}$, our right fingers curl in the direction
of rotation. Another way to say this is that if you look along the vector $\boldsymbol{\omega}$, you will
see the body rotating clockwise.

It is important to recognize that the angular velocity can change with time. If the
speed of rotation is changing, then $\boldsymbol{\omega}$ will be changing in magnitude, and if the axis of
rotation is changing, then $\boldsymbol{\omega}$ will be changing in direction. For example, the angular
velocity of a spacecraft that is tumbling out of control will usually change in both
magnitude and direction. In this case $\boldsymbol{\omega} = \boldsymbol{\omega}(t)$ is the instantaneous angular velocity
at the time $t$. On the other hand, there are many interesting situations where $\boldsymbol{\omega}$ is
constant (in magnitude and direction); for example, this is true (to an outstanding
approximation) of the angular velocity of the earth spinning on its axis.


A Useful Relation 

There is a useful relation between the angular velocity of a rigid body and the linear
velocity of any point in the body. Consider, for example, the earth, rotating with
angular velocity $\boldsymbol{\omega}$ about its center $O$ (which I shall take to be stationary for the present
discussion). Next, consider any point $P$ fixed on (or in) the earth, for example the top
of Mount Everest, with position $\mathbf{r}$ relative to $O$. We can specify $\mathbf{r}$ by its spherical
polar coordinates $(r, \theta, \phi)$ with $z$ axis pointing through the North Pole, so that $\theta$ is the
colatitude, that is, the latitude measured down from the North Pole (instead of up from the
equator, as is more usual with geographers). As the earth turns on its axis, the point $P$
is dragged in an easterly direction around a circle of latitude, with radius $\rho = r\sin\theta$,
as shown in Figure 9.7. This means that $P$ moves with speed $v = \omega r\sin\theta$, and if you
will check the direction in Figure 9.7, you will see that the vector velocity is $\boldsymbol{\omega} \times \mathbf{r}$.
You can easily see that this result is independent of the nature of the rotating body;
that is, for any rigid body rotating with angular velocity $\boldsymbol{\omega}$ about an axis through $O$,
the velocity of any point $P$ (position $\mathbf{r}$) fixed on the body is

$$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}. \tag{9.22}$$

Figure 9.7 The earth’s rotation drags the point $P$ on the surface
around a circle of latitude (radius $\rho = r\sin\theta$) with speed $v =
\omega\rho = \omega r\sin\theta$ and hence velocity $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$.

4 You can find a proof in, for example, Classical Mechanics by Herbert Goldstein, Charles Poole,
and John Safko (3rd ed., Addison-Wesley, 2002), Section 4-6.


This useful relation is, of course, a generalization of the familiar relation $v = \omega r$ that
you learned in introductory physics for the speed of a point on the perimeter of a wheel
of radius $r$. It is perhaps worth emphasizing that there is a corresponding relation for
any vector fixed in the rotating body. For example, if $\mathbf{e}$ is a unit vector fixed in the
body, then its rate of change, as seen from the non-rotating frame, is

$$\frac{d\mathbf{e}}{dt} = \boldsymbol{\omega} \times \mathbf{e}, \tag{9.23}$$

a result we shall be using shortly.
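The relation $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$ can be checked directly for a point on the earth. The sketch below is only an illustration: it uses a rounded value of the earth's angular velocity ($7.3 \times 10^{-5}$ rad/s) and radius ($6.37 \times 10^6$ m), picks an arbitrary colatitude and longitude, and confirms that the cross product has magnitude $\omega r \sin\theta$ and points east. The helper functions `cross` and `norm` are written out by hand so that the example needs only the standard library.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Euclidean length of a 3-component tuple."""
    return math.sqrt(sum(c * c for c in a))

OMEGA = 7.3e-5                  # earth's angular velocity in rad/s (rounded)
R = 6.37e6                      # earth's radius in metres
theta = math.radians(50.0)      # colatitude (arbitrary illustrative choice)
phi = math.radians(20.0)        # longitude (arbitrary illustrative choice)

# Position of P in spherical polars, z axis through the North Pole.
r = (R * math.sin(theta) * math.cos(phi),
     R * math.sin(theta) * math.sin(phi),
     R * math.cos(theta))
omega = (0.0, 0.0, OMEGA)

v = cross(omega, r)                                 # Eq. (9.22)
east = (-math.sin(phi), math.cos(phi), 0.0)         # unit vector pointing east at P
speed = norm(v)
print(f"speed = {speed:.1f} m/s, omega*r*sin(theta) = {OMEGA * R * math.sin(theta):.1f} m/s")
```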


Addition of Angular Velocities

A final basic property of angular velocities that is worth mentioning is that relative
angular velocities add in the same way as relative translational velocities. We know
(in the framework of classical mechanics) that if two frames 2 and 1 have relative
velocity $\mathbf{v}_{21}$ and if a body 3 has velocity $\mathbf{v}_{32}$ relative to frame 2, then the velocity of 3
relative to frame 1 is just the sum

$$\mathbf{v}_{31} = \mathbf{v}_{32} + \mathbf{v}_{21}. \tag{9.24}$$

Suppose instead that frame 2 is rotating with angular velocity $\boldsymbol{\omega}_{21}$ relative to frame
1 (both frames with the same origin $O$) and that body 3 is rotating (about $O$) with
angular velocities $\boldsymbol{\omega}_{31}$ and $\boldsymbol{\omega}_{32}$ relative to frames 1 and 2. Now consider any point
$\mathbf{r}$ fixed in body 3. Its translational velocities relative to frames 1 and 2 must satisfy
(9.24). According to (9.22) this means that

$$\boldsymbol{\omega}_{31} \times \mathbf{r} = (\boldsymbol{\omega}_{32} \times \mathbf{r}) + (\boldsymbol{\omega}_{21} \times \mathbf{r}) = (\boldsymbol{\omega}_{32} + \boldsymbol{\omega}_{21}) \times \mathbf{r}$$

and, since this must be true for any $\mathbf{r}$, it follows that

$$\boldsymbol{\omega}_{31} = \boldsymbol{\omega}_{32} + \boldsymbol{\omega}_{21}. \tag{9.25}$$


That is, angular velocities add in the same way as translational velocities. 


Notation for Angular Velocities 

In labeling angular velocities, I shall usually observe the following convention: I shall
use the lower case letter $\boldsymbol{\omega}$ for the angular velocity of a body (such as a spinning top)
whose motion is our primary object of interest. I shall use the capital letter $\boldsymbol{\Omega}$ for the
angular velocity of a noninertial, rotating reference frame relative to which we are
calculating the motion of one or more objects. This distinction is consistent with the
previous two sections, where I used capital $\mathbf{A}$ and $\mathbf{V}$ for the acceleration and velocity
of a noninertial frame (relative to an inertial frame). In practice, $\boldsymbol{\omega}$ usually denotes
an unknown, while $\boldsymbol{\Omega}$ is usually a given, known angular velocity, such as the angular
velocity of the earth as it rotates once a day. In the remainder of this chapter, we shall
be concerned with the motion of objects as seen in a rotating reference frame, and,
in accordance with this convention, I shall denote the angular velocity of that frame
by $\boldsymbol{\Omega}$.


9.4 Time Derivatives in a Rotating Frame 


We are now ready to consider the equations of motion for an object that is viewed
from a frame $\mathcal{S}$ that is rotating with angular velocity $\boldsymbol{\Omega}$ relative to an inertial frame
$\mathcal{S}_0$. While our conclusions will apply to any rotating frame, by far the most important
example is a frame attached to the rotating earth, and this is the example that you can
keep in mind. This being the case, let us pause to calculate the angular velocity of the
earth, which rotates on its axis once every 24 hours. 5 Therefore, for a frame attached
to the earth

$$\Omega = \frac{2\pi \text{ rad}}{24 \times 3600 \text{ s}} \approx 7.3 \times 10^{-5} \text{ rad/s}. \tag{9.26}$$
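The estimate (9.26) takes one line to reproduce. The sketch below, offered only as an illustration, also applies the 366/365 correction for the sidereal day mentioned in footnote 5 to show that it changes the answer by well under one percent. The variable names are assumptions introduced here.

```python
import math

SOLAR_DAY = 24 * 3600                        # seconds in one solar day

omega_solar = 2 * math.pi / SOLAR_DAY        # Eq. (9.26)
# The sidereal day is shorter by a factor 365/366 (footnote 5), so the
# rotation rate relative to the distant stars is larger by 366/365.
omega_sidereal = omega_solar * 366 / 365

print(f"solar:    {omega_solar:.3e} rad/s")
print(f"sidereal: {omega_sidereal:.3e} rad/s")
```

Both values round to $7.3 \times 10^{-5}$ rad/s, which is the figure used in the rest of the chapter.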


5 Strictly speaking the period of rotation about the earth’s axis is one sidereal day, the time to
rotate once relative to the distant stars. This is shorter than the solar day by a factor of 365/366, but
the difference is too small to worry about here.

Figure 9.8 The frame $\mathcal{S}_0$ defined by the three dashed axes is inertial.
The frame $\mathcal{S}$ defined by the three solid axes shares the same origin
$O$, but is rotating with angular velocity $\boldsymbol{\Omega}$ relative to $\mathcal{S}_0$.


It is because this angular velocity is so small that we can often ignore it entirely. 
Nevertheless, we shall see that the earth’s rotation does have measurable effects on 
the motion of projectiles, pendulums, and other systems. There are other noninertial 
effects (notably the tides), associated with the orbital motion of the earth and moon, 
but these are all much less important for the problems that we shall consider here, and 
I shall ignore them for now. 

I shall assume that the two frames $\mathcal{S}_0$ and $\mathcal{S}$ share a common origin $O$, as shown
in Figure 9.8, so that the only motion of $\mathcal{S}$ relative to $\mathcal{S}_0$ is a rotation with angular
velocity $\boldsymbol{\Omega}$. For example, the common origin $O$ could be the center of the earth, $\mathcal{S}$
could be a set of axes fixed in the earth, and $\mathcal{S}_0$ a set of axes with the same origin but
with directions fixed relative to the distant stars. The frame $\mathcal{S}$ is convenient for us to
use but is noninertial; the frame $\mathcal{S}_0$ is relatively inconvenient, but is inertial.

Let us now consider an arbitrary vector $\mathbf{Q}$. This could be the velocity or position
of a ball, the net force on an object, or any other vector of interest. Our first task is to
relate the time rate of change of $\mathbf{Q}$ as measured in frame $\mathcal{S}_0$ to the corresponding rate
as measured in $\mathcal{S}$. To distinguish these two rates of change, I shall temporarily use the
following notation:

$$\left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}_0} = (\text{rate of change of vector } \mathbf{Q} \text{ relative to inertial frame } \mathcal{S}_0)$$

and

$$\left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}} = (\text{rate of change of same vector } \mathbf{Q} \text{ relative to rotating frame } \mathcal{S}).$$

To compare these two rates of change, I shall expand the vector $\mathbf{Q}$ in terms of three
orthogonal unit vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ that are fixed in the rotating frame $\mathcal{S}$. (For
instance, these three unit vectors could point along the three solid axes shown in
Figure 9.8.) Thus, 6

$$\mathbf{Q} = Q_1\mathbf{e}_1 + Q_2\mathbf{e}_2 + Q_3\mathbf{e}_3 = \sum_{i=1}^{3} Q_i\mathbf{e}_i. \tag{9.27}$$


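This expansion is chosen for the convenience of observers in the frame $\mathcal{S}$, since the unit
vectors are fixed in that frame. Nonetheless, you should recognize that the expansion
is equally valid in both frames. [Whichever frame we use, (9.27) is just an expansion of
the one vector $\mathbf{Q}$ in terms of three orthogonal vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$.] The only difference
is that for observers in $\mathcal{S}$ the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ are fixed, but as seen by observers
in $\mathcal{S}_0$ the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ are rotating.

Let us now differentiate the expansion (9.27) with respect to time. First, as seen in
frame $\mathcal{S}$, the vectors $\mathbf{e}_i$ are constant, and we get simply

$$\left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}} = \sum_i \frac{dQ_i}{dt}\,\mathbf{e}_i. \tag{9.28}$$

[Since the expansion coefficients $Q_i$ in (9.27) are the same in either frame, we don’t
need to qualify the derivative on the right with a subscript $\mathcal{S}$ or $\mathcal{S}_0$.]

As seen in frame $\mathcal{S}_0$, the vectors $\mathbf{e}_i$ vary with time. Thus, differentiating (9.27) in
frame $\mathcal{S}_0$ gives

$$\left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}_0} = \sum_i \frac{dQ_i}{dt}\,\mathbf{e}_i + \sum_i Q_i \left(\frac{d\mathbf{e}_i}{dt}\right)_{\mathcal{S}_0}. \tag{9.29}$$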


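The derivative in the second term on the right is easily evaluated with the help of the
“useful relation” (9.23). The vector $\mathbf{e}_i$ is fixed in the frame $\mathcal{S}$, which is rotating with
angular velocity $\boldsymbol{\Omega}$ relative to $\mathcal{S}_0$. Therefore the rate of change of $\mathbf{e}_i$ as seen in $\mathcal{S}_0$ is
given by (9.23) as

$$\left(\frac{d\mathbf{e}_i}{dt}\right)_{\mathcal{S}_0} = \boldsymbol{\Omega} \times \mathbf{e}_i.$$

Thus, we can rewrite the second sum in (9.29) as

$$\sum_i Q_i \left(\frac{d\mathbf{e}_i}{dt}\right)_{\mathcal{S}_0} = \sum_i Q_i\,(\boldsymbol{\Omega} \times \mathbf{e}_i) = \boldsymbol{\Omega} \times \sum_i Q_i\mathbf{e}_i = \boldsymbol{\Omega} \times \mathbf{Q}.$$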


6 Now is one of those moments when, as anticipated in Chapter 1, the notation $\mathbf{e}_i$ with $i = 1, 2, 3$
is more convenient for our three unit vectors than $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$. For now, this is just because it lets us
use the summation sign, $\sum$, in sums like (9.27). In Chapter 10, we’ll find that the most convenient
choice of rotating axes is to use the principal axes of the rotating body, and the notation $\mathbf{e}_i$ works
very naturally for them. Thus I shall mostly use this notation for the unit vectors fixed in a rotating
body, and continue to use $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ for the nonrotating frame.

Inserting this result into (9.29), and using (9.28) to replace the first sum, we find that

$$\left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}_0} = \left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}} + \boldsymbol{\Omega} \times \mathbf{Q}. \tag{9.30}$$

This important identity relates the derivative of any one vector $\mathbf{Q}$ as measured in the
inertial frame $\mathcal{S}_0$ to the corresponding derivative in the rotating frame $\mathcal{S}$. In the next
section, I shall use it to find the form of Newton’s second law in the rotating frame $\mathcal{S}$.
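The identity (9.30) can be verified numerically for a concrete case. The sketch below is a minimal check under assumptions chosen purely for illustration: the rotating frame spins about the $z$ axis at a constant rate `OMEGA`, and the components $Q_i(t)$ in the rotating frame are arbitrary smooth functions. The left side of (9.30) is estimated by a central finite difference of the inertial-frame components; the right side is built from (9.28) plus $\boldsymbol{\Omega} \times \mathbf{Q}$. All function names are hypothetical.

```python
import math

OMEGA = 0.7   # rotation rate of frame S about the z axis, rad/s (arbitrary choice)

def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def basis(t):
    """Rotating unit vectors e_1, e_2, e_3 expressed in inertial components."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return ((c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0))

def components(t):
    """Arbitrary illustrative components Q_i(t) in the rotating frame."""
    return (t, t * t, math.sin(t))

def component_rates(t):
    """Time derivatives dQ_i/dt of the components above."""
    return (1.0, 2.0 * t, math.cos(t))

def combine(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i] as a 3-component tuple."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(3))

def q_inertial(t):
    """The vector Q in inertial components, from the expansion (9.27)."""
    return combine(components(t), basis(t))

def inertial_rate(t, h=1e-6):
    """(dQ/dt) in S_0, estimated by a central finite difference."""
    qp, qm = q_inertial(t + h), q_inertial(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(qp, qm))

def rotating_rate_plus_cross(t):
    """Right side of (9.30): (dQ/dt) in S, from (9.28), plus Omega x Q."""
    dq_rot = combine(component_rates(t), basis(t))
    om_x_q = cross((0.0, 0.0, OMEGA), q_inertial(t))
    return tuple(a + b for a, b in zip(dq_rot, om_x_q))

t = 1.3
lhs, rhs = inertial_rate(t), rotating_rate_plus_cross(t)
print(max(abs(a - b) for a, b in zip(lhs, rhs)))   # should be tiny
```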


9.5 Newton’s Second Law in a Rotating Frame 


We are now ready to find the form of Newton’s second law in the rotating frame $\mathcal{S}$.
To simplify matters, I shall assume that the angular velocity $\boldsymbol{\Omega}$ of $\mathcal{S}$ relative to $\mathcal{S}_0$ is
constant, as is the case (to an outstanding approximation) for axes fixed to the earth.
A rather surprising aspect of the statement that $\boldsymbol{\Omega}$ is constant is that if it is true in one
frame then it is automatically true in the other. This follows immediately from (9.30):
Since $\boldsymbol{\Omega} \times \boldsymbol{\Omega} = 0$, the two derivatives of $\boldsymbol{\Omega}$ are always the same; in particular, if one
is zero, so is the other.

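Consider now a particle of mass $m$ and position $\mathbf{r}$. In the inertial frame $\mathcal{S}_0$, the
particle obeys Newton’s second law in its normal form,

$$m\left(\frac{d^2\mathbf{r}}{dt^2}\right)_{\mathcal{S}_0} = \mathbf{F}, \tag{9.31}$$

where as usual $\mathbf{F}$ denotes the net force on the particle, the vector sum of all forces
identified in the inertial frame. The derivative on the left is, of course, the derivative
evaluated by observers in the inertial frame $\mathcal{S}_0$. However, we can now use Equation
(9.30) to express this derivative in terms of the derivatives evaluated in the rotating
frame $\mathcal{S}$. First, according to (9.30)

$$\left(\frac{d\mathbf{r}}{dt}\right)_{\mathcal{S}_0} = \left(\frac{d\mathbf{r}}{dt}\right)_{\mathcal{S}} + \boldsymbol{\Omega} \times \mathbf{r}.$$

Differentiating a second time, we find

$$\left(\frac{d^2\mathbf{r}}{dt^2}\right)_{\mathcal{S}_0} = \left(\frac{d}{dt}\right)_{\mathcal{S}_0}\left(\frac{d\mathbf{r}}{dt}\right)_{\mathcal{S}_0} = \left(\frac{d}{dt}\right)_{\mathcal{S}_0}\left[\left(\frac{d\mathbf{r}}{dt}\right)_{\mathcal{S}} + \boldsymbol{\Omega} \times \mathbf{r}\right].$$

Applying (9.30) to the outside derivative on the right, we find

$$\left(\frac{d^2\mathbf{r}}{dt^2}\right)_{\mathcal{S}_0} = \left(\frac{d}{dt}\right)_{\mathcal{S}}\left[\left(\frac{d\mathbf{r}}{dt}\right)_{\mathcal{S}} + \boldsymbol{\Omega} \times \mathbf{r}\right] + \boldsymbol{\Omega} \times \left[\left(\frac{d\mathbf{r}}{dt}\right)_{\mathcal{S}} + \boldsymbol{\Omega} \times \mathbf{r}\right]. \tag{9.32}$$

This rather messy result can be cleaned up. First, since our main concern is going to
be with derivatives evaluated in the rotating frame $\mathcal{S}$, we’ll revive the “dot” notation
for these derivatives. That is, I shall use $\dot{\mathbf{Q}}$ to denote

$$\dot{\mathbf{Q}} \equiv \left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}},$$

the derivative of any vector $\mathbf{Q}$ in the rotating frame $\mathcal{S}$. If we next note that, since $\boldsymbol{\Omega}$ is
constant, its derivative is zero, and we group together two like terms, we can rewrite
(9.32) as

$$\left(\frac{d^2\mathbf{r}}{dt^2}\right)_{\mathcal{S}_0} = \ddot{\mathbf{r}} + 2\boldsymbol{\Omega} \times \dot{\mathbf{r}} + \boldsymbol{\Omega} \times (\boldsymbol{\Omega} \times \mathbf{r}) \tag{9.33}$$

where the dots on the right all indicate derivatives evaluated with respect to the rotating
frame $\mathcal{S}$.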

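If we now substitute the result (9.33) into Newton’s second law (9.31) for the
inertial frame $\mathcal{S}_0$ and move two terms to the right, we find the form of Newton’s
second law for the rotating frame $\mathcal{S}$ to be

$$m\ddot{\mathbf{r}} = \mathbf{F} + 2m\dot{\mathbf{r}} \times \boldsymbol{\Omega} + m(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}, \tag{9.34}$$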


where, as usual, $\mathbf{F}$ denotes the sum of all the forces as identified in any inertial frame.
As with the accelerated frame of Section 9.1, we see that the equation of motion in a
rotating reference frame looks just like Newton’s second law, except that in this case
there are two extra terms on the force side of the equation. The first of these extra terms
is called the Coriolis force (after the French physicist G. G. de Coriolis, 1792-1843,
who was the first to explain it),

$$\mathbf{F}_{\text{cor}} = 2m\dot{\mathbf{r}} \times \boldsymbol{\Omega}. \tag{9.35}$$

The second is the so-called centrifugal force

$$\mathbf{F}_{\text{cf}} = m(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}. \tag{9.36}$$


I shall discuss these two terms in the next few sections. For now, the important point
is that we can go ahead and use Newton’s second law in rotating (and hence
noninertial) reference frames, provided we remember always to add these two “fictitious”
inertial forces to the net force $\mathbf{F}$ calculated for an inertial frame. That is, in a rotating
frame, 7

$$m\ddot{\mathbf{r}} = \mathbf{F} + \mathbf{F}_{\text{cor}} + \mathbf{F}_{\text{cf}}. \tag{9.37}$$
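The two inertial forces translate directly into code. The sketch below is a minimal illustration of (9.35) and (9.36) using plain tuples for vectors; the function names, the rounded values of the earth's rotation rate and radius, and the sample mass and velocity are all assumptions made for this example.

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(k, a):
    """Multiply the vector a by the scalar k."""
    return tuple(k * c for c in a)

def coriolis_force(m, v, Omega):
    """Eq. (9.35): F_cor = 2 m v x Omega, with v measured in the rotating frame."""
    return scale(2.0 * m, cross(v, Omega))

def centrifugal_force(m, r, Omega):
    """Eq. (9.36): F_cf = m (Omega x r) x Omega."""
    return scale(m, cross(cross(Omega, r), Omega))

# Illustrative case: a 1 kg mass on the equator (on the x axis), moving east (+y)
# at 100 m/s relative to the earth, with the earth's axis along z.
Omega = (0.0, 0.0, 7.3e-5)      # rad/s (rounded)
r = (6.37e6, 0.0, 0.0)          # m
v = (0.0, 100.0, 0.0)           # m/s

print("Coriolis   :", coriolis_force(1.0, v, Omega))
print("centrifugal:", centrifugal_force(1.0, r, Omega))
```

In this configuration both forces point along $+x$, that is, radially outward from the axis; for a mass at rest in the rotating frame the Coriolis force vanishes, as (9.35) requires.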


9.6 The Centrifugal Force 


We have just seen that in order to use Newton’s second law in a rotating frame (such
as a frame attached to the earth) we must introduce two inertial forces, the centrifugal
and the Coriolis forces. To some extent, we can examine the two forces separately. In
particular, the Coriolis force on an object is proportional to the object’s velocity $\mathbf{v} = \dot{\mathbf{r}}$
relative to the rotating frame. Therefore, the Coriolis force is zero for any object that is
at rest in the rotating frame, and it is negligible for objects that are moving sufficiently
slowly. For the rest of this chapter, our main concern will be with the rotating frame of
the earth, for which we can easily estimate the relative importance of the two inertial
forces. Because both forces involve vector products, they depend on the directions of
the various vectors, but for an order-of-magnitude estimate we can take

$$F_{\text{cor}} \sim mv\Omega \quad\text{and}\quad F_{\text{cf}} \sim mr\Omega^2,$$

where $v$ is the object’s speed relative to the rotating frame of the earth, that is, the
speed as observed by us on the earth’s surface. Therefore

$$\frac{F_{\text{cor}}}{F_{\text{cf}}} \sim \frac{v}{R\Omega} \sim \frac{v}{V}. \tag{9.38}$$

Here, in the middle expression, I have canceled a common factor of $m\Omega$ and I have
replaced $r$ by the earth’s radius $R$. (Remember that the origin is at the earth’s center, so
for objects near the earth’s surface, $r \approx R$.) In the last expression I have replaced $R\Omega$
by $V$, the speed of a point on the equator as the earth rotates with angular velocity $\boldsymbol{\Omega}$.
Since $V$ is approximately 1000 mi/h, (9.38) shows that for projectiles with $v \ll 1000$
mi/h it will be a good starting approximation to ignore the Coriolis force, and this is
what I shall do in this section. 8
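The equatorial speed $V = R\Omega$ and the order-of-magnitude ratio (9.38) are easy to evaluate. The sketch below is illustrative only: it uses rounded values for $R$ and $\Omega$, the standard conversion 1 mi/h = 0.44704 m/s, and an arbitrarily chosen projectile speed.

```python
OMEGA = 7.3e-5        # earth's angular velocity, rad/s (rounded)
R = 6.37e6            # earth's radius, m
MPS_PER_MPH = 0.44704 # metres per second in one mile per hour

V = R * OMEGA         # speed of a point on the equator, m/s

def coriolis_to_centrifugal(v):
    """Order-of-magnitude ratio F_cor / F_cf ~ v / V from Eq. (9.38)."""
    return v / V

print(f"V = {V:.0f} m/s = {V / MPS_PER_MPH:.0f} mi/h")
print(f"ratio for v = 50 m/s: {coriolis_to_centrifugal(50.0):.2f}")
```

With these numbers $V$ comes out a little over 1000 mi/h, and for an object moving at 50 m/s (an illustrative choice) the ratio is roughly one tenth.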

The centrifugal force is given by (9.36) as

$$\mathbf{F}_{\text{cf}} = m(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}. \tag{9.39}$$


7 Our derivation of this important result hinged crucially on the relation (9.30) between time 
derivatives of a given vector in the rotating and nonrotating frames. If you find that relation confusing, 
you might prefer the alternative derivation based on the Lagrangian formalism and outlined in 
Problem 9.11. 

8 As we shall see later, even when $v \ll 1000$ mi/h, the Coriolis force can have appreciable
effects (for instance, with the Foucault pendulum). Nevertheless, it is certainly true that $F_{\text{cor}}$ is
small compared to $F_{\text{cf}}$, and it makes sense to ignore the former at first.

Figure 9.9 The vector $\boldsymbol{\Omega} \times \mathbf{r}$ is the velocity of an object as it is
dragged eastward with speed $\Omega\rho$ by the earth’s rotation. Therefore,
the centrifugal force, $m(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}$, points radially outward from
the axis and has magnitude $m\Omega^2\rho$.


We can see what this looks like with the help of Figure 9.9, which shows an object on
or near the earth’s surface at colatitude $\theta$. The earth’s rotation carries the object around
a circle of latitude, and the vector $\boldsymbol{\Omega} \times \mathbf{r}$ (which is just the velocity of this circular
motion as seen from a nonrotating frame) is tangent to this circle. Thus $(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}$
points radially outward from the axis of rotation in the direction of $\hat{\boldsymbol{\rho}}$, the unit vector
in the $\rho$ direction of cylindrical polar coordinates. The magnitude of $(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}$ is
easily seen to be $\Omega^2 r\sin\theta = \Omega^2\rho$. Thus,

$$\mathbf{F}_{\text{cf}} = m\Omega^2\rho\,\hat{\boldsymbol{\rho}}. \tag{9.40}$$

To summarize, from the point of view of observers rotating with the earth, there
is a centrifugal force that is radially outward from the earth’s axis and has magnitude
$m\Omega^2\rho$. If we momentarily let $\mathbf{v} = \boldsymbol{\Omega} \times \mathbf{r}$ denote the velocity associated with the earth’s
rotation (observed from a nonrotating frame), then its magnitude is $v = \Omega\rho$, and the
centrifugal force takes the familiar form $mv^2/\rho$.

Free-Fall Acceleration 

The free-fall acceleration that we call g is the initial acceleration, relative to the earth, 
of an object that is released from rest in a vacuum near the earth’s surface. We can 
now see that this is actually a surprisingly complicated notion. The equation of motion 
(relative to the earth) is 9 

$$m\ddot{\mathbf{r}} = \mathbf{F}_{\mathrm{grav}} + \mathbf{F}_{\mathrm{cf}}, \tag{9.41}$$


9 I have defined $g$ as the initial acceleration of a body released from rest to ensure that the Coriolis
force is zero. When the object speeds up, we shall find that the Coriolis force eventually becomes 
important and the acceleration changes (although the effect is usually very small). 
where $\mathbf{F}_{\mathrm{cf}}$ is given by (9.40) and $\mathbf{F}_{\mathrm{grav}}$ is the gravitational force

$$\mathbf{F}_{\mathrm{grav}} = -\frac{GMm}{R^2}\,\hat{\mathbf{r}} = m\mathbf{g}_0. \tag{9.42}$$


Here, $M$ and $R$ are the mass and radius of the earth, and $\hat{\mathbf{r}}$ denotes the unit vector that
points radially out from O, the center of the earth. 10 The acceleration $\mathbf{g}_0$ is defined by
the second equality and could be called the “true” acceleration of gravity, inasmuch
as it is the acceleration we would observe if there were no centrifugal effect.

We see from (9.41) that the initial acceleration of a freely falling object is determined
by an effective force which is equal to the sum of two terms,

$$\mathbf{F}_{\mathrm{eff}} = \mathbf{F}_{\mathrm{grav}} + \mathbf{F}_{\mathrm{cf}} = m\mathbf{g}_0 + m\Omega^2 R \sin\theta\,\hat{\boldsymbol{\rho}} \tag{9.43}$$

where the last expression for $\mathbf{F}_{\mathrm{cf}}$ comes from (9.40) with $\rho$ replaced by $R\sin\theta$. The
two forces that make up the effective force are shown in Figure 9.10, from which it
is clear that the free-fall acceleration is in general not equal to the true gravitational
acceleration, either in magnitude or direction. Specifically, dividing (9.43) by $m$, we
find for the free-fall acceleration $\mathbf{g}$,

$$\mathbf{g} = \mathbf{g}_0 + \Omega^2 R \sin\theta\,\hat{\boldsymbol{\rho}}. \tag{9.44}$$

The component of $\mathbf{g}$ in the inward radial direction (the direction of $-\hat{\mathbf{r}}$) 11 is

$$g_{\mathrm{rad}} = g_0 - \Omega^2 R \sin^2\theta. \tag{9.45}$$

The second, centrifugal term is zero at the poles ($\theta = 0$ or $\pi$) and is largest at the
equator, where its magnitude is easily found [using the value of $\Omega$ from (9.26)]
to be

$$\Omega^2 R \approx (7.3 \times 10^{-5}\ \mathrm{s^{-1}})^2 \times (6.4 \times 10^{6}\ \mathrm{m}) \approx 0.034\ \mathrm{m/s^2}. \tag{9.46}$$

Since $g_0$ is about 9.8 m/s$^2$, we see that, because of the centrifugal force, the value
of $g$ at the equator is about 0.3% less than at the poles. 12 Although this difference is
certainly small, it is easily measured with modern gravimeters that can measure $g$ to
about 1 part in $10^9$.
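
The arithmetic behind (9.45) and (9.46) can be checked with a short Python sketch. The constant names and the helper `g_radial` are illustrative choices, not notation from the text; the numerical values are the rounded ones quoted above.

```python
import math

OMEGA = 7.3e-5    # earth's angular velocity in s^-1, the value quoted from (9.26)
R_EARTH = 6.4e6   # earth's radius in m, as used in (9.46)
G0 = 9.8          # approximate "true" gravitational acceleration in m/s^2

def g_radial(colatitude_deg, g0=G0):
    """Inward radial component of g from (9.45)."""
    theta = math.radians(colatitude_deg)
    return g0 - OMEGA**2 * R_EARTH * math.sin(theta)**2

# Centrifugal term at the equator, Eq. (9.46), and its size relative to g0.
centrifugal_equator = OMEGA**2 * R_EARTH
fraction = centrifugal_equator / G0

print(f"Omega^2 R = {centrifugal_equator:.3f} m/s^2")
print(f"reduction of g at the equator = {100 * fraction:.2f}%")
```

With these rounded inputs the reduction comes out near the 0.3% quoted in the text; the additional effect of the equatorial bulge is not modeled here.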


10 In claiming that the gravitational force is $-(GMm/r^2)\,\hat{\mathbf{r}}$, I am assuming that the earth is
perfectly spherically symmetric, which, although a very good approximation, is not exactly true.
Fortunately, all that matters is that $\mathbf{F}_{\mathrm{grav}}$ is certainly proportional to $m$, so it can always be written as
$m\mathbf{g}_0$. For almost all purposes we can say that $\mathbf{g}_0$ is in the direction of $-\hat{\mathbf{r}}$, and it is certainly extremely
close to that.

11 Strictly speaking this is the component of $\mathbf{g}$ in the direction of $\mathbf{g}_0$, not $-\hat{\mathbf{r}}$. By the same token,
the factor $\sin^2\theta$ should actually be $\sin\theta \sin\theta'$, where $\theta'$ is the angle between the line of $\mathbf{g}_0$ and north
(as opposed to $\theta$, the angle between $\mathbf{r}$ and north). However, the difference (which is only because the
earth is not perfectly spherically symmetric) is, for most practical purposes, completely negligible.

12 The actual difference is more like 0.5%, the additional 0.2% being the result of the earth’s 
bulge at the equator. 
Figure 9.10 The free-fall acceleration (relative to the earth) of
an object released from rest close to the earth’s surface is the
result of two force terms, the true gravitational force $m\mathbf{g}_0$ and
the centrifugal, inertial force $\mathbf{F}_{\mathrm{cf}}$, which points out from the axis
of rotation. (The size of the centrifugal term is much exaggerated
in this picture.)


You can see from (9.44) and Figure 9.10 that the tangential component of g (the 
component normal to the true gravitational force) comes entirely from the centrifugal 
force and is 


$$g_{\mathrm{tang}} = \Omega^2 R \sin\theta \cos\theta. \tag{9.47}$$

This tangential component of $\mathbf{g}$ is zero at the poles and at the equator, and is maximum
at latitude 45°. The most striking feature of a nonzero value for $g_{\mathrm{tang}}$ is that it means
the free-fall acceleration is not exactly in the direction of the true gravitational force.
As you can see from Figure 9.11, the angle between $\mathbf{g}$ and the radial direction is about
$\alpha \approx g_{\mathrm{tang}}/g_{\mathrm{rad}}$, and its maximum value (at $\theta = 45°$) is

$$\alpha_{\max} = \frac{\Omega^2 R}{2g_0} \approx \frac{0.034}{2 \times 9.8} \approx 0.0017\ \mathrm{rad} \approx 0.1°. \tag{9.48}$$

This angle $\alpha$ is the angle between the observed free-fall acceleration $\mathbf{g}$ and the true
acceleration of gravity $\mathbf{g}_0$, what we might be tempted to call “vertical.” The value
of $\alpha$ is actually rather difficult to measure. The direction of the observed $\mathbf{g}$ is easy
(in principle, at least). To find the direction of $\mathbf{g}_0$, you might hope to use a plumb
line, but a moment’s thought should convince you that a plumb line is also subject to
the centrifugal force and will hang in the direction of $\mathbf{g}$, not that of $\mathbf{g}_0$. In fact, any
attempt to find the direction of $\mathbf{g}_0$ simply and directly winds up finding the direction
of $\mathbf{g}$. For this reason, in what follows I shall define “vertical” as the direction of a
plumb line. Therefore, on those rare occasions when these tiny distinctions matter,
“vertical” will mean “in the direction of $\pm\mathbf{g}$.” By the same token, “horizontal” will
mean “perpendicular to $\mathbf{g}$.”
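
As a rough numerical check of (9.47) and (9.48), the following Python sketch evaluates the deviation angle between $\mathbf{g}$ and $\mathbf{g}_0$ as a function of colatitude. The function name `deviation_angle` and the constants are illustrative assumptions, and the earth is treated as perfectly spherical.

```python
import math

OMEGA = 7.3e-5    # earth's angular velocity in s^-1, from (9.26)
R_EARTH = 6.4e6   # earth's radius in m
G0 = 9.8          # approximate true gravitational acceleration in m/s^2

def deviation_angle(colatitude_deg):
    """Angle (in radians) between g and g0, using (9.45) and (9.47)."""
    theta = math.radians(colatitude_deg)
    g_tang = OMEGA**2 * R_EARTH * math.sin(theta) * math.cos(theta)
    g_rad = G0 - OMEGA**2 * R_EARTH * math.sin(theta)**2
    return math.atan2(g_tang, g_rad)

for colatitude in (0, 30, 45, 60, 90):
    alpha = deviation_angle(colatitude)
    print(f"colatitude {colatitude:2d} deg: alpha = {math.degrees(alpha):.4f} deg")
```

The printed angle peaks close to colatitude 45° at roughly a tenth of a degree and falls to zero at the pole and the equator, in line with the estimate above.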
Figure 9.11 Because of the centrifugal force, the free-fall acceleration
$\mathbf{g}$ has a nonzero tangential component (greatly exaggerated
here) and $\mathbf{g}$ deviates from the radial direction by the small
angle $\alpha$.


9.7 The Coriolis Force 


When an object is moving, there is a second inertial force that you must include when
you want to use Newton’s second law in a rotating frame. This is the Coriolis force
(9.35)

$$\mathbf{F}_{\mathrm{cor}} = 2m\dot{\mathbf{r}} \times \boldsymbol{\Omega} = 2m\mathbf{v} \times \boldsymbol{\Omega} \tag{9.49}$$

where $\mathbf{v} = \dot{\mathbf{r}}$ is the object’s velocity relative to the rotating frame. There is a remarkable
parallel between the Coriolis force and the well-known force $q\mathbf{v} \times \mathbf{B}$ on a charge $q$
in a magnetic field $\mathbf{B}$. Indeed, if we replace $2m$ by $q$ and $\boldsymbol{\Omega}$ by $\mathbf{B}$, the former becomes
exactly the latter. Although this parallel has no deep significance, it can often be a
help in visualizing how the Coriolis force is going to affect a particle’s motion.

The magnitude of the Coriolis force depends on the magnitudes of $\mathbf{v}$ and $\boldsymbol{\Omega}$ as well
as their relative orientations. For the case that the rotating frame is the earth, we have
seen in (9.26) that $\Omega \approx 7.3 \times 10^{-5}\ \mathrm{s^{-1}}$. For an object with $v \approx 50$ m/s (a fast baseball,
for example), the maximum acceleration the Coriolis force could produce (acting by
itself and with $\mathbf{v}$ perpendicular to $\boldsymbol{\Omega}$) would be

$$a_{\max} = 2v\Omega \approx 2 \times (50\ \mathrm{m/s}) \times (7.3 \times 10^{-5}\ \mathrm{s^{-1}}) \approx 0.007\ \mathrm{m/s^2}.$$

Compared to the free-fall acceleration $g = 9.8$ m/s$^2$, this is very small, though certainly
detectable if we were to take the trouble. Some projectiles, such as rockets and
long-range shells, travel much faster than 50 m/s, and for them the Coriolis force is
correspondingly more important. In addition, we shall see that there are systems, such
as the Foucault pendulum, where the Coriolis force, though very small, can act for a
long time and hence produce a large effect.
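
To get a feel for how the estimate $a_{\max} = 2v\Omega$ scales with speed, here is a minimal Python sketch. The function name and the sample speeds other than 50 m/s are illustrative assumptions, not values from the text.

```python
OMEGA = 7.3e-5   # earth's angular velocity in s^-1, from (9.26)
G = 9.8          # free-fall acceleration in m/s^2

def max_coriolis_acceleration(speed):
    """Largest Coriolis acceleration 2*v*Omega, reached when v is perpendicular to Omega."""
    return 2.0 * speed * OMEGA

# 50 m/s is the fast baseball from the text; the other speeds are illustrative only.
for speed in (50.0, 500.0, 2000.0):
    a = max_coriolis_acceleration(speed)
    print(f"v = {speed:6.0f} m/s: a_max = {a:.4f} m/s^2 = {a / G:.2e} g")
```

Because the acceleration is linear in $v$, a projectile ten times faster feels a Coriolis acceleration ten times larger, which is why the effect matters more for rockets and long-range shells.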
Direction of the Coriolis Force

Like the magnetic force $q\mathbf{v} \times \mathbf{B}$, the Coriolis force $2m\mathbf{v} \times \boldsymbol{\Omega}$ is always perpendicular
to the velocity of the moving object, with its direction given by the right-hand
rule. Figure 9.12 is an overhead view of a horizontal turntable that is rotating counterclockwise
relative to the ground. The angular velocity $\boldsymbol{\Omega}$ points vertically up (out of
the page in the figure). If we consider an object sliding or rolling across the turntable,
it is easy to see that, whatever the object’s position and velocity, the Coriolis force
tends to deflect the velocity to the right. Similarly, if the turntable were rotating clockwise,
the Coriolis deflection would always be to the left. (Whether the object actually
is deflected in the specified directions depends on what other forces are acting and
how big they are.)
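
The right-hand-rule statement can be checked directly by evaluating $2\mathbf{v} \times \boldsymbol{\Omega}$ for a few horizontal velocities. The sketch below is illustrative only; the helpers `cross`, `coriolis_acceleration`, and `deflection_side` are hypothetical names, and the unit magnitudes are arbitrary.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def coriolis_acceleration(velocity, omega):
    """Coriolis acceleration F_cor/m = 2 v x Omega, as in (9.49)."""
    return tuple(2.0 * c for c in cross(velocity, omega))

def deflection_side(velocity, omega):
    """Return 'right' or 'left' for a nonzero horizontal velocity, viewed from above."""
    a = coriolis_acceleration(velocity, omega)
    # z component of v x a: negative means a points to the right of v.
    turn = velocity[0] * a[1] - velocity[1] * a[0]
    return "right" if turn < 0 else "left"

omega_ccw = (0.0, 0.0, 1.0)    # counterclockwise turntable: Omega points up
omega_cw = (0.0, 0.0, -1.0)    # clockwise turntable: Omega points down

for v in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]:
    print(v, deflection_side(v, omega_ccw), deflection_side(v, omega_cw))
```

Every velocity direction gives a rightward deflection for the counterclockwise case and a leftward one for the clockwise case, independent of where the object sits on the turntable.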

We could imagine Figure 9.12 to be the Northern Hemisphere viewed from above 
the North Pole. (Since the earth rotates to the east, the angular velocity is directed 
as shown.) Thus we reach the conclusion that the Coriolis effect due to the earth’s 
rotation tends to deflect moving bodies to the right in the Northern Hemisphere (and, 
of course, to the left in the Southern Hemisphere). 13 This effect is important to
long-range gunners, who must aim to the left of their target in the Northern, and to the
right in the Southern, Hemisphere. (See Problem 9.28.) An important example from



Figure 9.12 Overhead view of a horizontal turntable that is rotating
counterclockwise relative to an inertial frame. The turntable’s angular
velocity $\boldsymbol{\Omega}$ points up out of the picture. As seen by observers
on the turntable, the two objects sliding on the table are subject to
Coriolis forces $\mathbf{F}_{\mathrm{cor}} = 2m\mathbf{v} \times \boldsymbol{\Omega}$. Irrespective of the bodies’ positions
and velocities, the Coriolis force always tends to deflect the
velocity to the right. (If the direction of rotation were clockwise,
then $\boldsymbol{\Omega}$ would be into the page and the Coriolis force would tend to
deflect moving objects to the left.)


13 Because the earth is three-dimensional (as opposed to a turntable, which is two-dimensional) 
the Coriolis effect is actually a little more complicated than this simple statement suggests. However, 
the statement above is certainly correct for objects moving parallel to the earth’s surface and for low 
trajectory projectiles. 
Figure 9.13 A cyclone is the result of air moving into a low-pressure
region and being deflected to the right (in the Northern
Hemisphere) by the Coriolis effect. This causes a counterclockwise
flow with the inward pressure force balancing the outward
Coriolis force.


meteorology is the phenomenon of cyclones. These occur when the air surrounding 
a region of low pressure moves rapidly inward. Because of the Coriolis effect, the air 
is deflected to the right, as shown in Figure 9.13, and therefore begins to circulate 
counterclockwise (in the Northern Hemisphere — clockwise in the Southern). When 
this happens sufficiently violently, the result is a storm, known variously as a cyclone, 
hurricane, or typhoon. 

It is important to bear in mind that both the Coriolis and centrifugal forces are 
at root kinematic effects, resulting from our insistence on using a rotating frame of 
reference. In a few simple cases, it is actually easier (as well as instructive) to analyze 
the motion in an inertial frame and then transform the results to the rotating frame, as 
the following example illustrates. Nevertheless, the transformation between the two 
frames is usually so complicated that it is easier to work all the time in the rotating 
frame and to live with the “fictitious” Coriolis and centrifugal forces. 


EXAMPLE 9.2 Simple Motion on a Turntable

Three observers A, B, and C are standing on a horizontal turntable with A at
the center, C at the edge and B halfway between, as shown in Figure 9.14(a).
The turntable is rotating counterclockwise (as seen from above) with angular
velocity $\Omega$. At time $t = 0$, A kicks a frictionless puck exactly toward B and C,
but to his surprise the puck misses both B and C, the latter by an even bigger
margin than the former. Explain these events from the points of view both of the
observers on the rotating table and of an observer on the ground.

The net force on the puck (as identified in any inertial frame) is zero. Thus
in the rotating frame, the only two forces are the centrifugal and Coriolis
forces. The former is always radially outward and has no bearing on the puck’s
deflection. The latter consistently deflects the puck’s velocity to the right, just
like an upward magnetic field acting on a positive charge. This causes the puck
to follow the curved path shown in Figure 9.14(a). At time $t_1$ when it reaches
Figure 9.14 (a) Three observers A, B, and C are in a line on a rotating turntable,
with A at the center and C at the perimeter. Observer A kicks a puck toward B and
C, but because of the Coriolis force the puck veers to the right and misses B and
C. (b) The same experiment as seen by an observer on the ground. In this frame,
the puck travels in a straight line, but by the time $t_1$ when it reaches the radius of
B, observer B has moved to the left. By the time $t_2$ when the puck reaches the
perimeter, C has moved even further to the left.


the radius of B, it is a small distance to the right of B, and at time $t_2$ when it
arrives at the turntable’s edge, it is considerably farther to the right of C (roughly four times as far, since the sideways displacement grows as $t^2$). This
explanation is correct and clear, but depends on our understanding the Coriolis 
force. By analyzing the same experiment in the ground-based frame, we can 
gain an additional understanding of why the deflection occurs. 

In the inertial frame of an observer on the ground, the net force on the puck is
zero, and the puck follows a straight path, as shown in Figure 9.14(b). However,
by the time $t_1$, when it should have been hitting B, observer B has moved to
the left as the turntable rotates. By time $t_2$, when it should have been hitting C,
observer C has moved even further to the left. As seen by the puck, B and C
have moved to the left. Therefore, as seen by B and C, the puck has curved to
the right as in Figure 9.14(a).

This simple alternative explanation of the Coriolis effect is actually deceptively
simple. In general, the effects of the Coriolis and centrifugal forces are
surprisingly complicated and are not nearly so easy to explain by reference to
the nonrotating frame. (See Problems 9.20 and 9.24.)
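
The ground-frame argument of Example 9.2 can be turned into a small calculation: move the puck in a straight line in the inertial frame, then rotate its coordinates into the turntable frame. The Python sketch below does this under assumed, purely illustrative values for the turntable radius, puck speed, and angular velocity; none of these numbers come from the text.

```python
import math

def puck_in_rotating_frame(t, speed, omega):
    """Puck position in turntable coordinates; in the inertial frame it is (speed*t, 0)."""
    x_inertial = speed * t
    angle = omega * t  # angle through which the turntable has turned (counterclockwise)
    # Rotate the inertial coordinates by -angle to get turntable coordinates.
    return x_inertial * math.cos(angle), -x_inertial * math.sin(angle)

# Hypothetical values chosen only for illustration.
radius, speed, omega = 2.0, 4.0, 0.2   # m, m/s, rad/s

t1 = (radius / 2) / speed   # puck reaches B's radius
t2 = radius / speed         # puck reaches the edge, where C stands

x1, y1 = puck_in_rotating_frame(t1, speed, omega)
x2, y2 = puck_in_rotating_frame(t2, speed, omega)

# In the turntable frame B and C stay at (radius/2, 0) and (radius, 0).
miss_b = math.hypot(x1 - radius / 2, y1)
miss_c = math.hypot(x2 - radius, y2)

print(f"at t1: puck at ({x1:.4f}, {y1:.4f}), misses B by {miss_b:.4f} m")
print(f"at t2: puck at ({x2:.4f}, {y2:.4f}), misses C by {miss_c:.4f} m")
```

In turntable coordinates the puck's $y$ value is negative, which is to the right of its initial direction of travel, and for small rotation angles the miss distance at C comes out close to four times that at B.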


9.8 Free Fall and the Coriolis Force 


Let us next consider the effect of the Coriolis force on a freely falling object, that is, an 
object falling in a vacuum, close to a point R on the earth’s surface. For this analysis, 
we must also include the centrifugal force, so the equation of motion is 


$$m\ddot{\mathbf{r}} = m\mathbf{g}_0 + \mathbf{F}_{\mathrm{cf}} + \mathbf{F}_{\mathrm{cor}} \tag{9.50}$$
where, as before, $m\mathbf{g}_0$ denotes the true force of the earth’s gravity on the object. The
centrifugal force is $m(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}$ (where $\mathbf{r}$ is the object’s position relative to the
center of the earth), but to an outstanding approximation we can replace $\mathbf{r}$ by $\mathbf{R}$ (the
position on the earth’s surface where the experiment is being conducted). Thus,

$$\mathbf{F}_{\mathrm{cf}} = m(\boldsymbol{\Omega} \times \mathbf{R}) \times \boldsymbol{\Omega}.$$

Returning to the equation of motion (9.50), you will recognize that the sum of the first
two terms on the right is just $m\mathbf{g}$, where $\mathbf{g}$ is the observed free-fall acceleration for
an object released from rest at position $\mathbf{R}$, as introduced in (9.44). In other words, we
can omit the term $\mathbf{F}_{\mathrm{cf}}$ from (9.50), if we replace $\mathbf{g}_0$ by the observed $\mathbf{g}$ at the location of
our experiment. If we substitute $2m\mathbf{v} \times \boldsymbol{\Omega}$ for $\mathbf{F}_{\mathrm{cor}}$, the equation of motion becomes
(after cancellation of a factor of $m$)

$$\ddot{\mathbf{r}} = \mathbf{g} + 2\dot{\mathbf{r}} \times \boldsymbol{\Omega}. \tag{9.51}$$

A simplifying feature of the equation (9.51) is that it does not involve the position
$\mathbf{r}$ at all (only its derivatives $\dot{\mathbf{r}}$ and $\ddot{\mathbf{r}}$). This means the equation will not change if we
make a change of origin (since a change of origin amounts to adding a constant to $\mathbf{r}$).
Accordingly, I shall now choose my origin on the surface of the earth at the position
$\mathbf{R}$, as shown in Figure 9.15. With this choice of axes, we can resolve the equation of
motion into its three components. The components of $\dot{\mathbf{r}}$ and $\boldsymbol{\Omega}$ are

$$\dot{\mathbf{r}} = (\dot{x}, \dot{y}, \dot{z})$$



Figure 9.15 Choice of axes for a free-fall experiment. The 
origin O is on the earth’s surface at the experiment’s location 
(position R relative to the center of the earth). The z axis 
points vertically up (more precisely, in the direction of —g, 
where g is the observed free-fall acceleration), the x and 
y axes are horizontal (that is, perpendicular to g), with y 
pointing north, and x due east. The position of the falling 
object relative to O is r. 
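and

$$\boldsymbol{\Omega} = (0,\ \Omega\sin\theta,\ \Omega\cos\theta).$$

Thus, those of $\dot{\mathbf{r}} \times \boldsymbol{\Omega}$ are

$$\dot{\mathbf{r}} \times \boldsymbol{\Omega} = (\dot{y}\,\Omega\cos\theta - \dot{z}\,\Omega\sin\theta,\ -\dot{x}\,\Omega\cos\theta,\ \dot{x}\,\Omega\sin\theta) \tag{9.52}$$

and the equation of motion (9.51) resolves into the following three equations:

$$\left.\begin{aligned}
\ddot{x} &= 2\Omega(\dot{y}\cos\theta - \dot{z}\sin\theta) \\
\ddot{y} &= -2\Omega\,\dot{x}\cos\theta \\
\ddot{z} &= -g + 2\Omega\,\dot{x}\sin\theta.
\end{aligned}\right\} \tag{9.53}$$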

We shall solve these three equations by making a succession of approximations
that depend on the smallness of $\Omega$. First, because $\Omega$ is very small, we get a reasonable
starting approximation if we ignore $\Omega$ entirely. In this approximation, the equations
reduce to

$$\ddot{x} = 0, \quad \ddot{y} = 0, \quad \text{and} \quad \ddot{z} = -g, \tag{9.54}$$

which are the equations of free fall solved in every introductory physics course. If the
object is dropped from rest at $x = y = 0$ and $z = h$, then the first two equations imply
that $\dot{x}$, $\dot{y}$, $x$, and $y$ all remain zero, while the last equation implies that $\dot{z} = -gt$ and
$z = h - \tfrac{1}{2}gt^2$. Thus our approximate solution is

$$x = 0, \quad y = 0, \quad \text{and} \quad z = h - \tfrac{1}{2}gt^2, \tag{9.55}$$

that is, the object falls vertically down with constant acceleration $g$. This approximation
is sometimes called the zeroth-order approximation because it involves only the
zeroth power of $\Omega$ (that is, it is independent of $\Omega$). It is well known to be a very good
approximation, but it shows none of the effects of the Coriolis force.

To get the next approximation, we argue as follows: The terms in (9.53) that involve
$\Omega$ are all small. Thus, it will be safe to evaluate these terms using our zeroth-order
approximation for $\dot{x}$, $\dot{y}$, and $\dot{z}$. Substituting (9.55) into the right side of (9.53), we get

$$\ddot{x} = 2\Omega g t \sin\theta, \quad \ddot{y} = 0, \quad \text{and} \quad \ddot{z} = -g. \tag{9.56}$$

The last two of these are exactly the same as in zeroth order, but the equation for
$x$ is new and is easily integrated twice to give

$$x = \tfrac{1}{3}\Omega g t^3 \sin\theta, \tag{9.57}$$

with $y$ and $z$ the same as in the zeroth approximation (9.55). This result is naturally
called the first-order approximation (being good through the first power of $\Omega$). We
can repeat this process again to get a second-order approximation and so on, but the
first-order is good enough for our purposes.

The striking thing about the solution (9.57) is that a freely falling object does not
fall straight down. Instead the Coriolis force causes it to curve slightly to the east
(positive $x$ direction). To get an idea of the magnitude of the effect, consider an object
dropped down a 100-meter-deep mine shaft at the equator, and let us find the total
deflection by the time it hits the bottom. The time to reach the bottom is determined
by the last of equations (9.55) as $t = \sqrt{2h/g}$, and (9.57) gives for the total easterly
deflection (putting $\theta = 90°$ and $g \approx 10$ m/s$^2$)

$$x = \tfrac{1}{3}\Omega g \left(\frac{2h}{g}\right)^{3/2} \approx \tfrac{1}{3} \times (7.3 \times 10^{-5}\ \mathrm{s^{-1}}) \times (10\ \mathrm{m/s^2}) \times (20\ \mathrm{s^2})^{3/2} \approx 2.2\ \mathrm{cm},$$

a small deflection, but certainly detectable. A small easterly deflection of this type
was actually predicted by Newton and verified by his rival Robert Hooke (of Hooke’s
law fame, 1635-1703), although it was not properly explained until the Coriolis effect
was understood.
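
One way to test the first-order result (9.57) is to integrate the full component equations (9.53) numerically and compare. The Python sketch below uses a hand-written fourth-order Runge-Kutta step; the function names, the step count, and the choice of stopping at the zeroth-order fall time $\sqrt{2h/g}$ are assumptions made for illustration.

```python
import math

OMEGA = 7.3e-5   # earth's angular velocity in s^-1, from (9.26)

def first_order_deflection(h, colatitude_deg, g=10.0):
    """Easterly deflection from (9.57) evaluated at the fall time t = sqrt(2h/g)."""
    theta = math.radians(colatitude_deg)
    t_fall = math.sqrt(2.0 * h / g)
    return OMEGA * g * t_fall**3 * math.sin(theta) / 3.0

def derivatives(state, theta, g):
    """Right-hand side of (9.53) for state = (x, y, z, vx, vy, vz)."""
    _, _, _, vx, vy, vz = state
    ax = 2.0 * OMEGA * (vy * math.cos(theta) - vz * math.sin(theta))
    ay = -2.0 * OMEGA * vx * math.cos(theta)
    az = -g + 2.0 * OMEGA * vx * math.sin(theta)
    return (vx, vy, vz, ax, ay, az)

def integrate_fall(h, colatitude_deg, g=10.0, steps=2000):
    """Integrate (9.53) with RK4 from rest at height h up to t = sqrt(2h/g)."""
    theta = math.radians(colatitude_deg)
    dt = math.sqrt(2.0 * h / g) / steps
    state = (0.0, 0.0, h, 0.0, 0.0, 0.0)

    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))

    for _ in range(steps):
        k1 = derivatives(state, theta, g)
        k2 = derivatives(shifted(state, k1, 0.5 * dt), theta, g)
        k3 = derivatives(shifted(state, k2, 0.5 * dt), theta, g)
        k4 = derivatives(shifted(state, k3, dt), theta, g)
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

x_formula = first_order_deflection(100.0, 90.0)
x_numeric, y_numeric, z_numeric = integrate_fall(100.0, 90.0)[:3]
print(f"first-order formula: x = {100 * x_formula:.3f} cm")
print(f"numerical solution:  x = {100 * x_numeric:.3f} cm")
```

For the 100 m mine shaft at the equator the two values agree closely, which is expected because the neglected corrections are of order $\Omega^2$.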


9.9 The Foucault Pendulum 


As a final and striking application of the Coriolis effect, let us consider the Foucault 
pendulum, which can be seen in many science museums around the world and is 
named for its inventor, the French physicist Jean Foucault (1819-1868). This is a 
pendulum made of a very heavy mass m suspended by a light wire from a tall ceiling. 
This arrangement allows the pendulum to swing freely for a very long time and to 
move in both the east-west and north-south directions. As seen in an inertial frame, 
there are just two forces on the bob, the tension T in the wire and the weight mg 0 . In 
the rotating frame of the earth, there are also the centrifugal and Coriolis forces, so 
the equation of motion in the earth’s frame is 

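$$m\ddot{\mathbf{r}} = \mathbf{T} + m\mathbf{g}_0 + m(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega} + 2m\dot{\mathbf{r}} \times \boldsymbol{\Omega}.$$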

Exactly as in the previous section, the second and third terms on the right combine 
to give mg, where g is the observed free-fall acceleration, and the equation of motion 
becomes 

$$m\ddot{\mathbf{r}} = \mathbf{T} + m\mathbf{g} + 2m\dot{\mathbf{r}} \times \boldsymbol{\Omega}. \tag{9.58}$$

We can now choose our axes as in the previous section, so that $x$ is east, $y$ is north,
and $z$ vertically up (direction of $-\mathbf{g}$), and the pendulum is as shown in Figure 9.16.

I shall restrict our discussion to the case of small oscillations, so that the angle $\beta$
between the pendulum and the vertical is always small. This allows two simplifying
approximations: First, the $z$ component of the tension $\mathbf{T}$ is well approximated by the
magnitude; that is, $T_z = T\cos\beta \approx T$. Second, it is not hard to see that, for small
oscillations, $T_z \approx mg$. 14 Putting these two approximations together, we can write

$$T \approx mg. \tag{9.59}$$


14 Look at the $z$ component of (9.58). In the limit of small oscillations, the term on the left and
the last term on the right both approach zero, and you’re left with $T_z - mg = 0$.
Figure 9.16 A Foucault pendulum comprises a bob of mass $m$
suspended by a light wire of length $L$ from the point P on a high
ceiling. The tension force on the bob is shown as $\mathbf{T}$ and its $x$ and
$y$ components are $T_x$ and $T_y$. For small oscillations the angle $\beta$
is very small.


We now need to examine the $x$ and $y$ components of the equation of motion (9.58).
This requires that we identify the $x$ and $y$ components of $\mathbf{T}$. If you look at Figure 9.16,
you will see that, by similar triangles, $T_x/T = -x/L$ and similarly for $T_y$. Combining
this with (9.59), we find that

$$T_x = -mgx/L \quad \text{and} \quad T_y = -mgy/L. \tag{9.60}$$

The $x$ and $y$ components of $\mathbf{g}$ are, of course, zero, and the components of $\dot{\mathbf{r}} \times \boldsymbol{\Omega}$ are
given in (9.52). Putting all of these into (9.58), we find (after canceling a factor of
$m$ and dropping a term involving $\dot{z}$, which is negligible compared to $\dot{x}$ or $\dot{y}$ for small
oscillations)

$$\left.\begin{aligned}
\ddot{x} &= -gx/L + 2\dot{y}\,\Omega\cos\theta \\
\ddot{y} &= -gy/L - 2\dot{x}\,\Omega\cos\theta,
\end{aligned}\right\} \tag{9.61}$$

where as usual $\theta$ denotes the colatitude of the location of the experiment. The factor
$g/L$ is just $\omega_0^2$, where $\omega_0$ is the natural frequency of the pendulum, and $\Omega\cos\theta$ is
just $\Omega_z$, the $z$ component of the earth’s angular velocity. Thus these two equations of
motion can be rewritten as

$$\left.\begin{aligned}
\ddot{x} - 2\Omega_z\dot{y} + \omega_0^2 x &= 0 \\
\ddot{y} + 2\Omega_z\dot{x} + \omega_0^2 y &= 0.
\end{aligned}\right\} \tag{9.62}$$


We can solve the coupled equations (9.62) using the trick, introduced in Chapter
2, of defining a complex number

$$\eta = x + iy.$$
Recall that not only does this complex number contain the same information as the
position in the $xy$ plane, but a plot of $\eta$ in the complex plane is an actual bird’s eye
view of the pendulum’s projected position $(x, y)$. If we multiply the second equation
of (9.62) by $i$ and add it to the first, we get the single differential equation

$$\ddot{\eta} + 2i\Omega_z\dot{\eta} + \omega_0^2\eta = 0. \tag{9.63}$$

This is a second-order, linear, homogeneous differential equation and so has exactly
two independent solutions. Thus if we can find two independent solutions, we shall
know that the most general solution is a linear combination of these two. As often
happens, we can find two independent solutions by inspired guesswork: We guess
that there is a solution of the form

$$\eta(t) = e^{-i\alpha t} \tag{9.64}$$

for some constant $\alpha$. Substituting this guess into (9.63), we see immediately that it is
a solution if and only if $\alpha$ satisfies

$$\alpha^2 - 2\Omega_z\alpha - \omega_0^2 = 0$$

or

$$\alpha = \Omega_z \pm \sqrt{\Omega_z^2 + \omega_0^2} \approx \Omega_z \pm \omega_0 \tag{9.65}$$

where the last line is an extremely good approximation since the earth’s angular
velocity $\Omega$ is so very much smaller than the pendulum’s $\omega_0$. This gives us the required
two independent solutions, and the general solution to the equation of motion (9.63)
is

$$\eta = e^{-i\Omega_z t}\left(C_1 e^{i\omega_0 t} + C_2 e^{-i\omega_0 t}\right). \tag{9.66}$$

To see what this solution looks like, we need to fix the two constants $C_1$ and $C_2$ by
specifying the initial conditions. Let us suppose that at $t = 0$ the pendulum has been
pulled aside in the $x$ direction (east) to a position $x = A$ and $y = 0$, and is released
from rest ($v_{x0} = v_{y0} = 0$). With these initial conditions, you can easily check that 15
$C_1 = C_2 = A/2$, and our solution becomes

$$\eta(t) = x(t) + iy(t) = Ae^{-i\Omega_z t}\cos\omega_0 t. \tag{9.67}$$

At $t = 0$ the complex exponential is equal to one, and $x = A$, while $y = 0$. Because
$\Omega_z \ll \omega_0$, the cosine factor in (9.67) makes many oscillations before the exponential
changes appreciably from one. This implies that, initially, $x(t)$ oscillates with angular
frequency $\omega_0$ between $\pm A$, while $y$ remains close to zero. That is, initially, the
pendulum swings in simple harmonic motion along the $x$ axis, as indicated in Figure
9.17(a).


15 Actually, there is a small subtlety, in that these simple values depend on the (true) assumption
that $\Omega_z \ll \omega_0$, as you will see when you check them.
Figure 9.17 Overhead views of the motion of a Foucault pendulum. (a) For
a while after being released, the pendulum swings back and forth along the
$x$ axis, with amplitude $A$ and frequency $\omega_0$. (b) As time advances, the plane
of its oscillations slowly rotates with angular velocity equal to $\Omega_z$, the $z$
component of the earth’s angular velocity.


However, eventually the complex exponential $e^{-i\Omega_z t}$ begins to change, causing
the complex number $\eta = x + iy$ to rotate through an angle $\Omega_z t$. In the Northern
Hemisphere, where $\Omega_z$ is positive, this means that the number $x + iy$ continues
to oscillate sinusoidally (due to the factor $\cos\omega_0 t$), but in a direction that rotates
clockwise. That is, the plane in which the pendulum is swinging rotates slowly
clockwise, with angular velocity $\Omega_z$, as indicated in Figure 9.17(b). In the Southern
Hemisphere, where $\Omega_z$ is negative, the corresponding rotation is counterclockwise.
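
The approximate solution (9.67) can be compared with a direct numerical integration of the complex equation (9.63). The Python sketch below does this with a simple Runge-Kutta loop; the parameter values (in particular a ratio $\Omega_z/\omega_0 = 0.01$, far larger than for any real pendulum) and the function names are assumptions chosen only to make the rotation visible in a short run.

```python
import cmath
import math

def eta_formula(t, amplitude, omega0, omega_z):
    """Approximate solution (9.67): eta = A exp(-i Omega_z t) cos(omega0 t)."""
    return amplitude * cmath.exp(-1j * omega_z * t) * math.cos(omega0 * t)

def eta_numeric(t_end, amplitude, omega0, omega_z, steps=20000):
    """Integrate (9.63) with RK4, starting at eta = A and at rest."""
    def accel(eta, eta_dot):
        return -2j * omega_z * eta_dot - omega0**2 * eta

    dt = t_end / steps
    eta, eta_dot = complex(amplitude, 0.0), 0j
    for _ in range(steps):
        k1e, k1v = eta_dot, accel(eta, eta_dot)
        k2e = eta_dot + 0.5 * dt * k1v
        k2v = accel(eta + 0.5 * dt * k1e, k2e)
        k3e = eta_dot + 0.5 * dt * k2v
        k3v = accel(eta + 0.5 * dt * k2e, k3e)
        k4e = eta_dot + dt * k3v
        k4v = accel(eta + dt * k3e, k4e)
        eta += dt * (k1e + 2 * k2e + 2 * k3e + k4e) / 6.0
        eta_dot += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return eta

# Illustrative, exaggerated values: ten full swings with Omega_z = 0.01 * omega0.
A, omega0, omega_z = 1.0, 1.0, 0.01
t_end = 20.0 * math.pi

approx = eta_formula(t_end, A, omega0, omega_z)
exact = eta_numeric(t_end, A, omega0, omega_z)
print(f"formula (9.67): eta = {approx:.4f}")
print(f"numerical:      eta = {exact:.4f}")
print(f"swing direction angle = {math.degrees(cmath.phase(exact)):.2f} deg")
```

After ten swings the direction of swing has a negative angle in the complex plane, that is, it has turned clockwise through about $\Omega_z t$, and the numerical and approximate values differ only at the level expected from the neglected terms of order $\Omega_z/\omega_0$.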

If the Foucault pendulum is located at colatitude $\theta$ (latitude $90° - \theta$), then the rate
at which its plane of oscillation rotates is

$$\Omega_z = \Omega\cos\theta. \tag{9.68}$$

At the North Pole ($\theta = 0$), $\Omega_z = \Omega$ and the rate of rotation of the pendulum is the same
as the earth’s angular velocity. This result is easy to understand: As seen in an inertial
(nonrotating) frame, a Foucault pendulum at the North Pole would obviously swing
in a fixed plane; meanwhile, as seen in the same inertial frame, the earth is rotating
counterclockwise (as seen from above) with angular velocity $\Omega$. Clearly then, as seen
from the earth, the pendulum’s plane of oscillation has to be rotating clockwise with
angular velocity $\Omega$.

At any other latitude, the result is much more complicated from an inertial point of
view, but the rate of rotation of the Foucault pendulum is easily calculated from (9.68).
At the equator ($\theta = 90°$), $\Omega_z = 0$ and the pendulum does not rotate. At a latitude
around 42° (the approximate latitude of Boston, Chicago, or Rome),

$$\Omega_z = \Omega\cos 48° \approx \tfrac{2}{3}\Omega.$$

Since $\Omega$ equals 360°/day, $\tfrac{2}{3}\Omega = 240°$/day, and we see that in the course of 6 hours (a
time for which a long, well-built pendulum will certainly continue to swing without
significant damping), the pendulum’s plane of motion will rotate through 60°, an
easily observable effect.
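
Equation (9.68) is easy to tabulate for different locations. The following Python sketch assumes, as the text does, that $\Omega$ is 360° per day; the function name and the sample latitudes other than 42° are illustrative.

```python
import math

EARTH_RATE_DEG_PER_DAY = 360.0   # Omega, taken as 360 degrees per day as in the text

def plane_rotation_rate(latitude_deg):
    """Rotation rate of the swing plane, Omega_z = Omega cos(theta), in degrees per day.

    theta is the colatitude, 90 degrees minus the latitude. A positive result
    means clockwise rotation as seen from above (Northern Hemisphere).
    """
    colatitude = math.radians(90.0 - latitude_deg)
    return EARTH_RATE_DEG_PER_DAY * math.cos(colatitude)

for latitude in (90.0, 42.0, 0.0, -42.0):
    rate = plane_rotation_rate(latitude)
    print(f"latitude {latitude:6.1f} deg: {rate:7.1f} deg/day, "
          f"{rate / 4.0:6.1f} deg in 6 hours")
```

At latitude 42° this gives close to 240° per day, or about 60° in 6 hours, matching the estimate above; the sign reverses in the Southern Hemisphere.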
9.10 Coriolis Force and Coriolis Acceleration


Recall that in Equation (1.48) of Chapter 1 we found the form of Newton’s second
law in two-dimensional polar coordinates,

$$\mathbf{F} = m\ddot{\mathbf{r}} \iff \begin{cases} F_r = m(\ddot{r} - r\dot{\phi}^2) \\ F_\phi = m(r\ddot{\phi} + 2\dot{r}\dot{\phi}). \end{cases} \tag{9.69}$$


We can now understand the rather ugly last term in each of the two equations on the
right in terms of the centrifugal and Coriolis forces.

Consider a particle that is subject to a net force $\mathbf{F}$ and moves in two dimensions.
(The exact same analysis works in three dimensions using cylindrical polar coordinates,
but for simplicity I shall work in two dimensions.) Relative to any inertial frame
S with origin O, the particle must satisfy (9.69). Now consider a noninertial frame
S' which shares the same origin O and is rotating at constant angular velocity $\Omega$,
chosen so that $\Omega = \dot{\phi}$ at one chosen time $t = t_0$. That is, at the chosen instant $t_0$, the
frame S' and the particle are rotating at the same rate. (For this reason, the frame S' is
sometimes called the co-rotating frame.) If the particle has polar coordinates $(r', \phi')$
relative to S', then at all times

$$r' = r$$

(since S and S' share the same origin), and at the time $t_0$

$$\dot{\phi}' = 0$$

since the frame S' and the particle are rotating at the same rate at $t = t_0$. Newton’s
second law can be applied in the frame S', provided we include the centrifugal and
Coriolis forces. Thus

$$\mathbf{F} + \mathbf{F}_{\mathrm{cf}} + \mathbf{F}_{\mathrm{cor}} = m\ddot{\mathbf{r}}'. \tag{9.70}$$


Let us write this equation in polar coordinates: The centrifugal force $\mathbf{F}_{\mathrm{cf}}$ is purely
radial, with radial component $mr\Omega^2$. (Remember that $r' = r$, so it makes no difference
whether we write $r$ or $r'$.) The Coriolis force $\mathbf{F}_{\mathrm{cor}}$ is $2m\mathbf{v}' \times \boldsymbol{\Omega}$, and, since $\mathbf{v}'$ is
purely radial in the co-rotating frame, $\mathbf{F}_{\mathrm{cor}}$ is in the $\hat{\boldsymbol{\phi}}$ direction with $\phi$ component
$-2m\dot{r}\Omega$. Finally the term $m\ddot{\mathbf{r}}'$ on the right of (9.70) can be replaced by the analog
of (9.69), except that in the co-rotating frame $\dot{\phi}' = 0$ (at the chosen time $t_0$), so the
terms containing $\dot{\phi}'$ will be absent. Putting all this together, we find for the equation
of motion of the particle in the co-rotating frame,

$$\mathbf{F} + \mathbf{F}_{\mathrm{cf}} + \mathbf{F}_{\mathrm{cor}} = m\ddot{\mathbf{r}}' \iff \begin{cases} F_r + mr\Omega^2 = m\ddot{r} \\ F_\phi - 2m\dot{r}\Omega = mr\ddot{\phi}'. \end{cases} \tag{9.71}$$


(Because the frame S' is rotating at a constant rate, I could replace $\ddot{\phi}'$ by $\ddot{\phi}$ since they
are equal.)
Let us now compare the equation of motion (9.69) for the inertial frame, with
(9.71) for the co-rotating frame. The most important thing to recognize is that, because
$\Omega = \dot{\phi}$, they are exactly the same equations for $r$ and $\phi$, although certain terms are
distributed differently between the two sides. In (9.69) for the nonrotating frame, the
only force terms on the left are the real net force, with components $F_r$ and $F_\phi$. On
the right of (9.69), the acceleration contains the centripetal acceleration $-r\dot{\phi}^2$ in its
radial component and the Coriolis acceleration $2\dot{r}\dot{\phi}$ in its $\phi$ component. In (9.71) for
the rotating frame, neither of these additional acceleration terms is present (because
we arranged that $\dot{\phi}'$ is zero), but instead they are reincarnated on the force side of the
equations (with opposite signs, of course) as the centrifugal force $m\Omega^2 r$ in the radial
equation and Coriolis force $-2m\dot{r}\Omega$ in the $\phi$ equation.

Since the two versions of the equations are the same, it is clear that they are equally 
correct. In the inertial frame, the forces are simpler (no “fictitious” forces) but the 
accelerations are more complicated; in the rotating frame, it is the other way round. 
Which frame one chooses to use is dictated by convenience. In particular, when the 
observer is anchored to a rotating frame (as we earthlings are), it is generally more 
convenient to work in the rotating frame and to learn to live with the “fictitious” 
centrifugal and Coriolis forces. 


Principal Definitions and Equations of Chapter 9

Inertial Force in an Accelerating but Nonrotating Frame 

The motion of a body, as seen in a frame that has acceleration $\mathbf{A}$ relative to an inertial
frame, can be found using Newton’s second law in the form $m\ddot{\mathbf{r}} = \mathbf{F} + \mathbf{F}_{\mathrm{inertial}}$, where
$\mathbf{F}$ is the net force on the body (as measured in any inertial frame) and $\mathbf{F}_{\mathrm{inertial}}$ is an
additional inertial force

$$\mathbf{F}_{\mathrm{inertial}} = -m\mathbf{A}. \qquad \text{[Eq. (9.5)]}$$

The Angular Velocity Vector 

If a body is rotating about an axis specified by the unit vector $\mathbf{u}$ (direction given by
the right-hand rule) at a rate $\omega$ (usually measured in radians per second), its angular
velocity vector is defined as

$$\boldsymbol{\omega} = \omega\mathbf{u}. \qquad \text{[Eq. (9.21)]}$$


The “Useful Relation” 

The velocity of a point $\mathbf{r}$ fixed in a rigid body that is rotating with angular velocity
$\boldsymbol{\omega}$ is

$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$. [Eq. (9.22)]
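
Time Derivatives in a Rotating Frame

If frame $\mathcal{S}$ has angular velocity $\boldsymbol{\Omega}$ relative to frame $\mathcal{S}_0$, then the time derivatives of
a single vector $\mathbf{Q}$ as seen in the two frames are related by

$\left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}_0} = \left(\frac{d\mathbf{Q}}{dt}\right)_{\mathcal{S}} + \boldsymbol{\Omega} \times \mathbf{Q}$. [Eq. (9.30)]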


Newton’s Second Law in a Rotating Frame 

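If frame $\mathcal{S}$ has angular velocity $\boldsymbol{\Omega}$ relative to an inertial frame $\mathcal{S}_0$, then Newton’s second
law in the rotating frame takes the form

$m\ddot{\mathbf{r}} = \mathbf{F} + \mathbf{F}_{\text{cor}} + \mathbf{F}_{\text{cf}}$, [Eq. (9.37)]

where $\mathbf{F}$ is the net force on the body (as measured in any inertial frame) and the inertial
forces $\mathbf{F}_{\text{cor}}$ and $\mathbf{F}_{\text{cf}}$ are the Coriolis and centrifugal forces,

$\mathbf{F}_{\text{cor}} = 2m\dot{\mathbf{r}} \times \boldsymbol{\Omega}$ and $\mathbf{F}_{\text{cf}} = m(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}$. [Eqs. (9.35) & (9.36)]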
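
As a quick numerical illustration of Eqs. (9.35) and (9.36), the following sketch evaluates the two inertial forces for given vectors. It is a minimal example under stated assumptions, not part of the original summary: the helper names `cross`, `coriolis_force`, and `centrifugal_force` are hypothetical, vectors are plain 3-tuples measured in the rotating frame, and the numerical values are chosen only for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def coriolis_force(m, v, omega):
    """F_cor = 2 m v x Omega, with v the velocity seen in the rotating frame."""
    return tuple(2.0 * m * c for c in cross(v, omega))


def centrifugal_force(m, r, omega):
    """F_cf = m (Omega x r) x Omega."""
    return tuple(m * c for c in cross(cross(omega, r), omega))


# Illustrative values (assumed): a turntable rotating counterclockwise
# about the z axis at 1 rad/s; a 1 kg body at x = 1 m moving in +x at 1 m/s.
omega = (0.0, 0.0, 1.0)
r = (1.0, 0.0, 0.0)
v = (1.0, 0.0, 0.0)

f_cor = coriolis_force(1.0, v, omega)     # points in -y, to the right of v
f_cf = centrifugal_force(1.0, r, omega)   # points in +x, radially outward
```

For a counterclockwise rotation the Coriolis force deflects the body to the right of its velocity, while the centrifugal force points away from the axis regardless of the velocity.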

Free-Fall Acceleration 

The observed free-fall acceleration $\mathbf{g}$ (defined as the initial acceleration, relative to
the earth, from rest) includes the “true” gravitational acceleration $\mathbf{g}_0$ and the effect of
the centrifugal force

$\mathbf{g} = \mathbf{g}_0 + (\boldsymbol{\Omega} \times \mathbf{R}) \times \boldsymbol{\Omega}$. [Eq. (9.44)]

“Vertical” is defined as the direction of g, and “horizontal” as perpendicular to g. 


Problems for Chapter 9

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult (***).

section 9.1 Acceleration without Rotation 

9.1 * Be sure you understand why a pendulum in equilibrium hanging in a car that is accelerating 
forward tilts backward, and then consider the following: A helium balloon is anchored by a massless 
string to the floor of a car that is accelerating forward with acceleration A. Explain clearly why the 
balloon tends to tilt forward and find its angle of tilt in equilibrium. [Hint: Helium balloons float 
because of the buoyant Archimedean force, which results from a pressure gradient in the air. What is 
the relation between the directions of the gravitational field and the buoyant force?] 

9.2 * A donut-shaped space station (outer radius $R$) arranges for artificial gravity by spinning on the
axis of the donut with angular velocity $\omega$. Sketch the forces on, and accelerations of, an astronaut
standing in the station (a) as seen from an inertial frame outside the station and (b) as seen in the
astronaut’s personal rest frame (which has a centripetal acceleration $A = \omega^2 R$ as seen in the inertial
frame). What angular velocity is needed if $R = 40$ meters and the apparent gravity is to equal the usual
value of about 10 m/s$^2$? (c) What is the percentage difference between the perceived $g$ at a six-foot
astronaut’s feet ($R = 40$ m) and at his head ($R = 38$ m)?

section 9.2 The Tides 

9.3 ** (a) Consider the tidal force (9.12) on a mass $m$ at the position $P$ of Figure 9.4. Write $d$ as
$(d_0 - R_e) = d_0(1 - R_e/d_0)$ and use the binomial approximation $(1 - \epsilon)^{-2} \approx 1 + 2\epsilon$ to show that
$\mathbf{F}_{\text{tid}} \approx -(2GM_m m R_e/d_0^3)\,\hat{\mathbf{x}}$. Confirm the direction of the force shown in Figure 9.4 and make a
numerical comparison of the tidal force with the gravitational force $mg$ of the earth. (b) Do the
corresponding calculations for the force at the point $R$. Compare this force with that of part (a)
(magnitude and direction).

9.4 ** Do the same calculations as in Problem 9.3(a) but for the tidal force at the point $Q$ in Figure 9.4.
[In this case write $\hat{\mathbf{d}}/d^2 = \mathbf{d}/d^3$ and use the binomial approximation in the form $(1 + \epsilon)^{-3/2} \approx 1 - \tfrac{3}{2}\epsilon$.]

9.5 ** Review the derivation of the tidal potential energy (9.16) of a drop of water at the point Q in 
Figure 9.5 and then give in detail the derivation of (9.17) for the tidal PE at the point P. 

9.6 *** Let $h(\theta)$ denote the height of the ocean at any point $T$ on the surface, where $h(\theta)$ is measured
up from the level at the point $Q$ of Figure 9.5 and $\theta$ is the polar angle $TOR$ of $T$. Given that the surface
of the ocean is an equipotential, show that $h(\theta) = h_0\cos^2\theta$, where $h_0 = 3M_m R_e^4/(2M_e d_0^3)$. Sketch
and describe the shape of the ocean’s surface, bearing in mind that $h_0 \ll R_e$. [Hint: You will need to
evaluate $U_{\text{tid}}(T)$ as given by (9.13), with $d$ equal to the distance $MT$. To do this you need to find $d$ by
the law of cosines and then approximate $d^{-1}$ using the binomial approximation, being very careful to
keep all terms through order $(R_e/d_0)^2$. Neglect any effects of the sun.]

section 9.4 Time Derivatives in a Rotating Frame 

9.7 * (a) Explain the relation (9.30) between the derivatives of a vector $\mathbf{Q}$ in two frames $\mathcal{S}_0$ and $\mathcal{S}$ for
the special case that $\mathbf{Q}$ is fixed in the frame $\mathcal{S}$. (b) Do the same for a vector $\mathbf{Q}$ that is fixed in the frame
$\mathcal{S}_0$ and compare with your answer to part (a).

section 9.5 Newton’s Second Law in a Rotating Frame 

9.8 * What are the directions of the centrifugal and Coriolis forces on a person moving (a) south near 
the North Pole, (b) east on the equator, and (c) south across the equator? 

9.9 * A bullet of mass $m$ is fired with muzzle speed $v_0$ horizontally and due north from a position at
colatitude $\theta$. Find the direction and magnitude of the Coriolis force in terms of $m$, $v_0$, $\theta$, and the earth’s
angular velocity $\Omega$. How does the Coriolis force compare with the bullet’s weight if $v_0 = 1000$ m/s and
$\theta$ = 40 deg?

9.10 ** The derivation of the equation of motion (9.34) for a rotating frame made the assumption
that the angular velocity $\boldsymbol{\Omega}$ was constant. Show that if $\dot{\boldsymbol{\Omega}} \neq 0$ then there is a third “fictitious force,”
sometimes called the azimuthal force, on the right side of (9.34) equal to $m\mathbf{r} \times \dot{\boldsymbol{\Omega}}$.

9.11 *** In this problem you will prove the equation of motion (9.34) for a rotating frame using the
Lagrangian approach. As usual, the Lagrangian method is in many ways easier than the Newtonian
(except that it calls for some slightly tricky vector gymnastics), but is perhaps less insightful. Let $\mathcal{S}$ be
a noninertial frame rotating with constant angular velocity $\boldsymbol{\Omega}$ relative to the inertial frame $\mathcal{S}_0$. Let both
frames have the same origin, $O = O'$. (a) Find the Lagrangian $\mathcal{L} = T - U$ in terms of the coordinates
$\mathbf{r}$ and $\dot{\mathbf{r}}$ of $\mathcal{S}$. [Remember that you must first evaluate $T$ in the inertial frame. In this connection, recall
that $\mathbf{v}_0 = \mathbf{v} + \boldsymbol{\Omega} \times \mathbf{r}$.] (b) Show that the three Lagrange equations reproduce (9.34) precisely.

section 9.6 The Centrifugal Force 

9.12 * (a) Show that to design a static structure in a rotating frame (such as a space station) one can
use the ordinary rules of statics except that one must include the extra “fictitious” centrifugal force.
(b) I wish to place a puck on a rotating horizontal turntable (angular velocity $\Omega$) and to have it remain
at rest on the table, held by the force of static friction (coefficient $\mu$). What is the maximum distance
from the axis of rotation at which I can do this? (Argue from the point of view of an observer in the
rotating frame.)

9.13 * Show that the angle $\alpha$ between a plumb line and the direction of the earth’s center is well
approximated by $\tan\alpha = (R_e\Omega^2\sin 2\theta)/(2g)$, where $g$ is the observed free-fall acceleration and we
assume the earth is perfectly spherically symmetric. Estimate the maximum and minimum values of
the magnitude of $\alpha$.

9.14 ** I am spinning a bucket of water about its vertical axis with angular velocity $\Omega$. Show that,
once the water has settled in equilibrium (relative to the bucket), its surface will be a parabola. (Use
cylindrical polar coordinates and remember that the surface is an equipotential under the combined
effects of the gravitational and centrifugal forces.)

9.15 ** On a certain planet, which is perfectly spherically symmetric, the free-fall acceleration has
magnitude $g = g_0$ at the North Pole and $g = \lambda g_0$ at the equator (with $0 < \lambda < 1$). Find $g(\theta)$, the
free-fall acceleration at colatitude $\theta$ as a function of $\theta$.

section 9.7 The Coriolis Force 

9.16 * The center of a long frictionless rod is pivoted at the origin and the rod is forced to rotate at a
constant angular velocity $\Omega$ in a horizontal plane. Write down the equation of motion for a bead that
is threaded on the rod, using the coordinates $x$ and $y$ of a frame that rotates with the rod (with $x$ along
the rod and $y$ perpendicular to it). Solve for $x(t)$. What is the role of the centrifugal force? What of the
Coriolis force?

9.17 * Consider the bead threaded on a circular hoop of Example 7.6 (page 260), working in a frame 
that rotates with the hoop. Find the equation of motion of the bead, and check that your result agrees 
with Equation (7.69). Using a free-body diagram, explain the result (7.71) for the equilibrium positions. 

9.18 ** A particle of mass $m$ is confined to move, without friction, in a vertical plane, with axes $x$
horizontal and $y$ vertically up. The plane is forced to rotate with constant angular velocity $\Omega$ about the
$y$ axis. Find the equations of motion for $x$ and $y$, solve them, and describe the possible motions.

9.19 ** I am standing (wearing crampons) on a perfectly frictionless flat merry-go-round, which is
rotating counterclockwise with angular velocity $\Omega$ about its vertical axis. (a) I am holding a puck at
rest just above the floor (of the merry-go-round) and release it. Describe the puck’s path as seen from
above by an observer who is looking down from a nearby tower (fixed to the ground) and also as seen by
me on the merry-go-round. In the latter case explain what I see in terms of the centrifugal and Coriolis
forces. (b) Answer the same questions for a puck which is released from rest by a long-armed spectator
who is standing on the ground leaning over the merry-go-round.

9.20 ** Consider a frictionless puck on a horizontal turntable that is rotating counterclockwise with
angular velocity $\Omega$. (a) Write down Newton’s second law for the coordinates $x$ and $y$ of the puck as
seen by me standing on the turntable. (Be sure to include the centrifugal and Coriolis forces, but ignore
the earth’s rotation.) (b) Solve the two equations by the trick of writing $\eta = x + iy$ and guessing a
solution of the form $\eta = e^{-i\alpha t}$. [In this case, as in the case of critically damped SHM discussed in
Section 5.4, you get only one solution this way. The other has the same form (5.43) we found for
the second solution in damped SHM.] Write down the general solution. (c) At time $t = 0$, I push the
puck from position $\mathbf{r}_0 = (x_0, 0)$ with velocity $\mathbf{v}_0 = (v_{x0}, v_{y0})$ (all as measured by me on the turntable).
Show that

$x(t) = (x_0 + v_{x0}t)\cos\Omega t + (v_{y0} + \Omega x_0)\,t\sin\Omega t$
$y(t) = -(x_0 + v_{x0}t)\sin\Omega t + (v_{y0} + \Omega x_0)\,t\cos\Omega t$ (9.72)

(d) Describe and sketch the behavior of the puck for large values of $t$. [Hint: When $t$ is large the terms
proportional to $t$ dominate (except in the case that both their coefficients are zero). With $t$ large, write
(9.72) in the form $x(t) = t(B_1\cos\Omega t + B_2\sin\Omega t)$, with a similar expression for $y(t)$, and use the trick
of (5.11) to combine the sine and cosine into a single cosine (or sine, in the case of $y(t)$). By now
you can recognize that the path is the same kind of spiral, whatever the initial conditions (with the one
exception mentioned).]

9.21 ** When a puck slides on a rotating turntable, as in Problems 9.20 and 9.24, it can come 
instantaneously to rest. Sketch the shape of the path when this happens and explain. If you did Problem 
9.24, comment on the relevance of this result to part (d) of that problem. 


9.22 ** If a negative charge $-q$ (an electron, for example) in an elliptical orbit around a fixed positive
charge $Q$ is subjected to a weak uniform magnetic field $\mathbf{B}$, the effect of $\mathbf{B}$ is to make the ellipse
precess slowly, an effect known as Larmor precession. To prove this, write down the equation of
motion of the negative charge in the field of $Q$ and $\mathbf{B}$. Now rewrite it for a frame rotating with angular
velocity $\boldsymbol{\Omega}$. [Remember that this changes both $d^2\mathbf{r}/dt^2$ and $d\mathbf{r}/dt$.] Show that by suitable choice of
$\boldsymbol{\Omega}$ you can arrange that the terms involving $\dot{\mathbf{r}}$ cancel out, but that you are left with one term involving
$\mathbf{B} \times (\mathbf{B} \times \mathbf{r})$. If $\mathbf{B}$ is weak enough this term can certainly be neglected. Show that in this case the orbit
in the rotating frame is an ellipse (or hyperbola). Describe the appearance of the ellipse as seen in the
original nonrotating frame.

9.23 ** Here is an unusual way to solve the two-dimensional isotropic oscillator, that is, the motion of a
particle subject to a force $-k\mathbf{r}$. Show that by choosing a suitable rotating reference frame, you can
arrange that the centrifugal force exactly cancels the force $-k\mathbf{r}$. Recalling the analogy between the
Coriolis and magnetic forces, you should be able to write down the general solution for the motion as
seen in the rotating frame. If you write your solution in the complex form of Section 2.7, then you can
transform back to the nonrotating frame by multiplying by a suitable rotating complex number. Show
that the general solution is an ellipse. [See Problem 8.11 for some guidance on this last part.]

9.24 *** [Computer] Use a suitable plotting program (such as ParametricPlot in Mathematica) to
plot the orbits (9.72) of the puck of Problem 9.20 on a rotating turntable with $x_0 = \Omega = 1$ and the
following initial velocities $\mathbf{v}_0$: (a) (0, 1), (b) (0, 0), (c) (0, -1), (d) (-0.5, -0.5), (e) (-0.7, -0.7),
(f) (0, -0.1). Comment on any interesting features.
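
Mathematica is only one option for this problem. As a possible starting point, here is a minimal Python sketch that tabulates points on the orbit (9.72). The function name `puck_orbit` and its parameters (`t_max`, `steps`) are illustrative assumptions, not anything prescribed by the problem; the resulting lists of points can be handed to whatever plotting library is available.

```python
import math


def puck_orbit(v0, x0=1.0, omega=1.0, t_max=10.0, steps=200):
    """Return a list of (x, y) points of the orbit (9.72) seen on the turntable."""
    vx0, vy0 = v0
    points = []
    for i in range(steps + 1):
        t = t_max * i / steps
        a = x0 + vx0 * t              # multiplies cos in x(t) and -sin in y(t)
        b = (vy0 + omega * x0) * t    # multiplies sin in x(t) and cos in y(t)
        x = a * math.cos(omega * t) + b * math.sin(omega * t)
        y = -a * math.sin(omega * t) + b * math.cos(omega * t)
        points.append((x, y))
    return points


# The six initial velocities listed in the problem, labeled (a) to (f).
cases = {
    "a": (0.0, 1.0), "b": (0.0, 0.0), "c": (0.0, -1.0),
    "d": (-0.5, -0.5), "e": (-0.7, -0.7), "f": (0.0, -0.1),
}
orbits = {label: puck_orbit(v0) for label, v0 in cases.items()}
```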

section 9.8 Free Fall and the Coriolis Force


9.25 * A high-speed train is traveling at a constant 150 m/s (about 300 mph) on a straight, horizontal 
track across the South Pole. Find the angle between a plumb line suspended from the ceiling inside the 
train and another inside a hut on the ground. In what direction is the plumb line on the train deflected? 


9.26 ** In Section 9.8, we used a method of successive approximations to find the orbit of an object
that is dropped from rest, correct to first order in the earth’s angular velocity $\Omega$. Show in the same way
that if an object is thrown with initial velocity $\mathbf{v}_0$ from a point $O$ on the earth’s surface at colatitude $\theta$,
then to first order in $\Omega$ its orbit is

$x = v_{x0}t + \Omega(v_{y0}\cos\theta - v_{z0}\sin\theta)t^2 + \tfrac{1}{3}\Omega g t^3\sin\theta$
$y = v_{y0}t - \Omega(v_{x0}\cos\theta)t^2$
$z = v_{z0}t - \tfrac{1}{2}gt^2 + \Omega(v_{x0}\sin\theta)t^2$. (9.73)

[First solve the equations of motion (9.53) in zeroth order, that is, ignoring $\Omega$ entirely. Substitute your
zeroth-order solution for $\dot{x}$, $\dot{y}$, and $\dot{z}$ into the right side of equations (9.53) and integrate to give the next
approximation. Assume that $v_0$ is small enough that air resistance is negligible and that $\mathbf{g}$ is a constant
throughout the flight.]


9.27 ** In Section 9.8, we discussed the path of an object that is dropped from a very tall stepladder
above the equator. (a) Sketch this path as seen from a tower to the north of the drop and fixed to the
earth. Explain why the object lands to the east of its point of release. (b) Sketch the same experiment as
seen by an inertial observer floating in space to the north of the drop. Explain clearly (from this point
of view) why the object lands to the east of its point of release. [Hint: The object’s angular momentum
about the earth’s center is conserved. This means that the object’s angular velocity $\dot\phi$ changes as it falls.]

9.28 ** Use the result (9.73) of Problem 9.26 to do the following: A naval gun shoots a shell at
colatitude $\theta$ in a direction that is $\alpha$ above the horizontal and due east, with muzzle speed $v_0$. (a) Ignoring
the earth’s rotation (and air resistance), find how long ($t$) the shell would be in the air and how far away
($R$) it would land. If $v_0 = 500$ m/s and $\alpha$ = 20°, what are $t$ and $R$? (b) A naval gunner spots an enemy
ship due east at the range $R$ of part (a) and, forgetting about the Coriolis effect, aims his gun exactly
as in part (a). Find by how far north or south, and in which direction, the shell will miss the target, in
terms of $\Omega$, $v_0$, $\alpha$, $\theta$, and $g$. (It will also miss in the east-west direction but this is perhaps less critical.)
If the incident occurs at latitude 50° north ($\theta$ = 40°), what is this distance? What if the latitude is 50°
south? This problem is a serious issue in long-range gunnery: In a battle near the Falkland Islands in
World War I, the British navy consistently missed German ships by many tens of yards because they
apparently forgot that the Coriolis effect in the southern hemisphere is opposite to that in the north.

9.29 ** (a) A baseball is thrown vertically up with initial speed $v_0$ from a point on the ground at
colatitude $\theta$. Use the solution (9.73) of Problem 9.26 to show that the ball will return to the ground a
distance $(4\Omega v_0^3\sin\theta)/(3g^2)$ to the west of its launch point. (b) Estimate the size of this effect on the
equator if $v_0 = 40$ m/s. (c) Sketch the ball’s orbit as seen from the north (by an observer fixed to the
earth). Compare with the orbit of a ball dropped from a point above the equator, and explain why the
Coriolis effect moves the dropped ball to the east, but the thrown ball to the west.

9.30 *** The Coriolis force can produce a torque on a spinning object. To illustrate this, consider a
horizontal hoop of mass $m$ and radius $r$ spinning with angular velocity $\omega$ about its vertical axis at
colatitude $\theta$. Show that the Coriolis force due to the earth’s rotation produces a torque of magnitude
$m\omega\Omega r^2\sin\theta$ directed to the west, where $\Omega$ is the earth’s angular velocity. This torque is the basis of
the gyrocompass.

9.31 *** The Compton generator is a beautiful demonstration of the Coriolis force due to the earth’s
rotation, invented by the American physicist A. H. Compton (1892-1962, best known as author of the
Compton effect) while he was still an undergraduate. A narrow glass tube in the shape of a torus or
ring (radius $R$ of the ring $\gg$ radius of the tube) is filled with water, plus some dust particles to let one
see any motion of the water. The ring and water are initially stationary and horizontal, but the ring is
then spun through 180° about its east-west diameter. Explain why this should cause the water to move
around the tube. Show that the speed of the water just after the 180° turn should be $2\Omega R\cos\theta$, where
$\Omega$ is the earth’s angular velocity, and $\theta$ is the colatitude of the experiment. What would this speed be
if $R \approx 1$ m and $\theta$ = 40°? Compton measured this speed with a microscope and got agreement within
3%.

9.32 *** Do all parts of Problem 9.28, but find the distance by which the shell misses its target in both
the north-south and east-west directions. [Hint: In this case you must recognize that the time of flight
is affected by the Coriolis effect.]

section 9.9 The Foucault Pendulum 

9.33 ** The general solution for the small-amplitude motion of a Foucault pendulum is given by (9.66).
If at $t = 0$ the pendulum is at rest with $x = A$ and $y = 0$, find the two coefficients $C_1$ and $C_2$, and show
that because $\Omega \ll \omega_0$ they are well approximated as $C_1 = C_2 = A/2$, giving the solution (9.67).

9.34 *** At a point $P$ on the earth’s surface, an enormous perfectly flat and frictionless platform is
built. The platform is exactly horizontal, that is, perpendicular to the local free-fall acceleration $\mathbf{g}_P$.
Find the equation of motion for a puck sliding on the platform and show that it has the same form as
(9.61) for the Foucault pendulum except that the pendulum’s length $L$ is replaced by the earth’s radius
$R$. What is the frequency of the puck’s oscillations and what is that of its Foucault precession? [Hints:
Write the puck’s position vector, relative to the earth’s center $O$, as $\mathbf{R} + \mathbf{r}$, where $\mathbf{R}$ is the position of
the point $P$ and $\mathbf{r} = (x, y, 0)$ is the puck’s position relative to $P$. The contribution to the centrifugal
force involving $\mathbf{R}$ can be absorbed into $\mathbf{g}_P$ and the contribution involving $\mathbf{r}$ is negligible. The restoring
force comes from the variation of $\mathbf{g}$ as the puck moves.] To check the validity of your approximations,
compare the approximate size of the gravitational restoring force, the Coriolis force, and the neglected
term $m(\boldsymbol{\Omega} \times \mathbf{r}) \times \boldsymbol{\Omega}$ in the centrifugal force.





CHAPTER 10

Rotational Motion of Rigid Bodies


A rigid body is a collection of N particles with the property that its shape cannot 
change — the distance between any two of its constituent particles is fixed. A perfectly 
rigid body is, of course, an idealization, but an extremely useful one, and one to which 
many real systems are good approximations. In many ways a rigid body made up of N 
particles is much simpler than an arbitrary system of N particles: The arbitrary system 
requires 3N coordinates to specify its configuration, three coordinates for each of N
particles. The rigid body requires only six coordinates, three to specify the position 
of the center of mass and three to specify the body’s orientation. Further, we shall see 
that the motion of a rigid body can be divided into two separate simpler problems, the 
translational motion of the center of mass and the rotation of the body around the CM. 

I shall start the chapter with some general results, mostly related to the CM of the 
body. These results generalize the results we found at the beginning of Chapter 8 for 
two particles, and most of them apply to any system of N particles. However, I shall 
quickly specialize to the motion of a rigid body. Much the most interesting aspect 
of the latter is the rotational motion, and this is what will occupy us for most of the 
chapter. 


10.1 Properties of the Center of Mass 


Consider a system of $N$ particles $\alpha = 1, \ldots, N$ with masses $m_\alpha$ and positions $\mathbf{r}_\alpha$
measured relative to a chosen origin $O$. The center of mass of the system was defined
in Chapter 3, Equation (3.9), to be the position (relative to the same origin $O$)

$\mathbf{R} = \frac{1}{M}\sum_{\alpha=1}^{N} m_\alpha \mathbf{r}_\alpha$ or $\frac{1}{M}\int \mathbf{r}\,dm$ (10.1)

where $M$ denotes the total mass of all of the particles and the integral form is used
when the system can be considered to be a continuous distribution of mass.
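
For a finite set of point masses, the sum in (10.1) is straightforward to evaluate numerically. The following is a minimal sketch, assuming positions are given as 3-tuples; the function name `center_of_mass` and the sample values are hypothetical and serve only to illustrate the definition.

```python
def center_of_mass(masses, positions):
    """Return (M, R): the total mass and the CM position of point masses.

    masses    -- sequence of N masses m_alpha
    positions -- sequence of N position vectors r_alpha (3-tuples)
    """
    total = sum(masses)
    R = tuple(
        sum(m * r[i] for m, r in zip(masses, positions)) / total
        for i in range(3)
    )
    return total, R


# Illustrative system (assumed values): three particles on the x axis.
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
M, R = center_of_mass(masses, positions)
```

The heavier particles pull the CM toward their end of the line, so here it lies beyond the midpoint $x = 1$.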

The Total Momentum and the CM

Several important parameters of the system’s motion can be neatly expressed in terms
of the CM. As we saw in Chapter 3, Equation (3.11), the total momentum is

$\mathbf{P} = \sum_\alpha \mathbf{p}_\alpha = \sum_\alpha m_\alpha \dot{\mathbf{r}}_\alpha = M\dot{\mathbf{R}}$. (10.2)

That is, the total momentum of the system is exactly the same as that of a single
particle of mass equal to the total mass $M$ and velocity equal to that of the CM. If we
differentiate this result we see that $\dot{\mathbf{P}} = M\ddot{\mathbf{R}}$ or, since $\dot{\mathbf{P}}$ equals the net external force
$\mathbf{F}_{\text{ext}}$ on the system [as we saw in Equation (1.29)],

$\mathbf{F}_{\text{ext}} = M\ddot{\mathbf{R}}$. (10.3)


That is, the CM moves just as if it were a single particle of mass M subject to the 
net external force on the system. This result is the single most important justification 
for our treating extended objects like baseballs and comets as point particles. To the 
extent that these nonpoint objects can be represented by their CM, they do move just 
like point particles. 


The Total Angular Momentum 

The role of the CM motion in a system’s total angular momentum is more complicated, 
but equally crucial. The following argument does not depend on the system being a 
rigid body, but to be definite let us consider a rigid body made up of N pieces with 
masses m a , as sketched in Figure 10.1, where the body is shown as an ellipsoid. The 
position of m a relative to an arbitrary origin O is shown as r a and that of the CM 
relative to O as R. Also shown is the position C of m a relative to the CM, which 
satisfies 


r a = R + r^. (10.4) 

The angular momentum l a of m a about the origin O is 

t a = r cr x Pa = r a x m aTa • (10.5) 

Thus the total angular momentum relative to O is 

L = = X 

If we use (10.4) to rewrite both r a and r a , we find that L is the sum of four terms: 
L=^Rx m a R + ^ R x m a r' a + ^ r' a x m a R + ^ x m a r' a . 

Figure 10.1 A rigid body (shown here as an ellipsoid) is made
up of many small pieces, $\alpha = 1, \ldots, N$. The mass of a typical
piece is $m_\alpha$ and its position relative to the origin $O$ is $\mathbf{r}_\alpha$. The
position of the CM relative to $O$ is $\mathbf{R}$, and $\mathbf{r}'_\alpha$ denotes the position
of $m_\alpha$ relative to the CM, so that $\mathbf{r}_\alpha = \mathbf{R} + \mathbf{r}'_\alpha$.


If we factor out of each of these four sums the terms that do not depend on $\alpha$, we find
(remember $\sum m_\alpha = M$)

$\mathbf{L} = \mathbf{R} \times M\dot{\mathbf{R}} + \mathbf{R} \times \sum m_\alpha \dot{\mathbf{r}}'_\alpha + \left(\sum m_\alpha \mathbf{r}'_\alpha\right) \times \dot{\mathbf{R}} + \sum \mathbf{r}'_\alpha \times m_\alpha \dot{\mathbf{r}}'_\alpha$. (10.6)

This expression can now be simplified dramatically. Notice first that the sum in
parentheses in the third term on the right is the position of the CM relative to the
CM (times $M$). This is, of course, zero (Problem 10.1):

$\sum m_\alpha \mathbf{r}'_\alpha = 0$. (10.7)

Therefore the third term in (10.6) is zero. Differentiating this relation, we see that the
sum in the second term in (10.6) is likewise zero. Thus all that remains of (10.6) is

$\mathbf{L} = \mathbf{R} \times \mathbf{P} + \sum \mathbf{r}'_\alpha \times m_\alpha \dot{\mathbf{r}}'_\alpha$. (10.8)

The first term is the angular momentum (relative to O) of the motion of the CM. 
The second is the angular momentum of the motion relative to the CM. Thus we can 
re-express (10.8) to say 


L = L(motion of CM) + L(motion relative to CM). (10.9) 
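
The decomposition (10.9) holds for any system of particles, so it can be checked numerically on arbitrary data. The sketch below is one way to do this, assuming plain 3-tuples for vectors; the helper names (`cross`, `add`, `sub`, `scale`, `weighted_mean`, `angular_momentum`) and the sample masses, positions, and velocities are all invented for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(s, a):
    return tuple(s * x for x in a)


def weighted_mean(masses, vectors):
    """Mass-weighted average of 3-vectors (gives the CM position or velocity)."""
    total = (0.0, 0.0, 0.0)
    for m, v in zip(masses, vectors):
        total = add(total, scale(m, v))
    return scale(1.0 / sum(masses), total)


def angular_momentum(masses, positions, velocities):
    """Sum of r x (m v) over all particles."""
    L = (0.0, 0.0, 0.0)
    for m, r, v in zip(masses, positions, velocities):
        L = add(L, cross(r, scale(m, v)))
    return L


# Arbitrary illustrative data (assumed): three particles.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 3.0)]
velocities = [(0.0, 1.0, 0.0), (1.0, 0.0, -1.0), (2.0, -1.0, 0.5)]

M = sum(masses)
R = weighted_mean(masses, positions)      # CM position
V = weighted_mean(masses, velocities)     # CM velocity, so that P = M V

L_total = angular_momentum(masses, positions, velocities)
L_cm = cross(R, scale(M, V))              # first term of (10.8): R x P
L_rel = angular_momentum(
    masses,
    [sub(r, R) for r in positions],       # positions relative to the CM
    [sub(v, V) for v in velocities],      # velocities relative to the CM
)
L_sum = add(L_cm, L_rel)                  # agrees with L_total up to rounding
```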


To illustrate this useful result consider the motion of a planet around the sun (which 
we can safely treat as fixed because it is so massive). In this case, (10.9) asserts that the 
total angular momentum of the planet is the angular momentum of the orbital motion 
of the CM around the sun, plus the angular momentum of its spinning motion around
its CM,

$\mathbf{L} = \mathbf{L}_{\text{orb}} + \mathbf{L}_{\text{spin}}$. (10.10)

This division of the total angular momentum into its orbital and spin parts is especially
useful because it is often true (at least to a good approximation) that the two parts are
separately conserved. To see this, note first that since $\mathbf{L}_{\text{orb}} = \mathbf{R} \times \mathbf{P}$,

$\dot{\mathbf{L}}_{\text{orb}} = \dot{\mathbf{R}} \times \mathbf{P} + \mathbf{R} \times \dot{\mathbf{P}} = \mathbf{R} \times \mathbf{F}_{\text{ext}}$ (10.11)

(since the first cross product is zero and $\dot{\mathbf{P}} = \mathbf{F}_{\text{ext}}$). That is, $\mathbf{L}_{\text{orb}}$ evolves just as if the
planet were a point particle, with all its mass concentrated at its CM. In particular, if
the force of the sun on the planet were perfectly central ($\mathbf{F}_{\text{ext}}$ exactly collinear with $\mathbf{R}$),
then $\mathbf{L}_{\text{orb}}$ would be constant. In practice the force is not exactly central (since planets
are not perfectly spherical and the sun’s gravitational field is not perfectly uniform),
but it is true to an excellent approximation.

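To find $\dot{\mathbf{L}}_{\text{spin}}$, we can write $\mathbf{L}_{\text{spin}} = \mathbf{L} - \mathbf{L}_{\text{orb}}$. We already know that $\dot{\mathbf{L}} = \boldsymbol{\Gamma}_{\text{ext}}$, so

$\dot{\mathbf{L}} = \sum \mathbf{r}_\alpha \times \mathbf{F}^{\text{ext}}_\alpha = \sum (\mathbf{r}'_\alpha + \mathbf{R}) \times \mathbf{F}^{\text{ext}}_\alpha = \sum \mathbf{r}'_\alpha \times \mathbf{F}^{\text{ext}}_\alpha + \mathbf{R} \times \mathbf{F}_{\text{ext}}$. (10.12)

Subtracting (10.11) from (10.12) gives $\dot{\mathbf{L}}_{\text{spin}}$,

$\dot{\mathbf{L}}_{\text{spin}} = \dot{\mathbf{L}} - \dot{\mathbf{L}}_{\text{orb}} = \sum \mathbf{r}'_\alpha \times \mathbf{F}^{\text{ext}}_\alpha = \boldsymbol{\Gamma}_{\text{ext}}$ (about CM);

that is, the rate of change of $\mathbf{L}_{\text{spin}}$, the angular momentum about the CM, is just the
net external torque, measured relative to the CM. [This natural-seeming result was
mentioned without proof in Equation (3.28). What makes it a little surprising is that
a reference frame attached to the CM is generally not an inertial frame. Surprising
or not, the result is true and very useful.] Since the torque of the sun about the CM
of any planet is very small, $\mathbf{L}_{\text{spin}}$ is very nearly constant. Nevertheless, this useful
conclusion, although an excellent approximation, is not exact. For instance, because
of our own earth’s equatorial bulge, there is a small torque on the earth due to the sun
(and moon), and $\mathbf{L}_{\text{spin}}$ is not quite constant. The slow changing of $\mathbf{L}_{\text{spin}}$ is responsible
for the effect known as the precession of the equinoxes, the rotation of the earth’s axis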
relative to the stars by some 50 arcseconds per year. 

There is a corresponding (though not exactly analogous) division of angular momentum
into its orbital and spin parts in quantum mechanics. For example, the angular
momentum of the electron orbiting around the proton in a hydrogen atom is made up
of two terms as in (10.10), and for much the same reasons each separate kind of angular
momentum is almost perfectly conserved. Here too, this useful result is only
approximately true: In this case there is a weak magnetic torque on the electron and 
neither the spin nor the orbital angular momentum is exactly conserved (although the 
total angular momentum is). 
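
Kinetic Energy

The total kinetic energy of $N$ particles is

$T = \sum_{\alpha=1}^{N} \tfrac{1}{2} m_\alpha \dot{\mathbf{r}}_\alpha^2$. (10.13)

As before, we can use (10.4) to replace $\dot{\mathbf{r}}_\alpha$ by $\dot{\mathbf{R}} + \dot{\mathbf{r}}'_\alpha$, which gives

$\dot{\mathbf{r}}_\alpha^2 = (\dot{\mathbf{R}} + \dot{\mathbf{r}}'_\alpha)^2 = \dot{\mathbf{R}}^2 + \dot{\mathbf{r}}'^2_\alpha + 2\dot{\mathbf{R}} \cdot \dot{\mathbf{r}}'_\alpha$

and hence

$T = \tfrac{1}{2}\sum m_\alpha \dot{\mathbf{R}}^2 + \tfrac{1}{2}\sum m_\alpha \dot{\mathbf{r}}'^2_\alpha + \dot{\mathbf{R}} \cdot \sum m_\alpha \dot{\mathbf{r}}'_\alpha$. (10.14)

The sum in the last term on the right is zero by (10.7) and we find that

$T = \tfrac{1}{2} M\dot{\mathbf{R}}^2 + \tfrac{1}{2}\sum m_\alpha \dot{\mathbf{r}}'^2_\alpha$ (10.15)

or

T = T(motion of CM) + T(motion relative to CM). (10.16)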


For a rigid body, the only possible motion relative to the CM is rotation. Thus we can 
rephrase this result to say that 

T = T(motion of CM) + T(rotation about CM). (10.17)

This useful result says, for example, that the kinetic energy of a wheel rolling down 
the road is the translational energy of the CM plus the energy of the rotation about the 
axle. 
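
The rolling-wheel statement can be tested numerically by summing the kinetic energy particle by particle. The sketch below makes several assumptions that are not in the text: the wheel is modeled as a thin hoop of `n` equal point masses (so its moment of inertia about the axle is $Ma^2$, with $a$ the radius), it rolls without slipping along $+x$, and the function name `rolling_hoop_energies` and its default values are purely illustrative.

```python
import math


def rolling_hoop_energies(M=2.0, a=0.5, v=3.0, n=360):
    """Compare the directly summed kinetic energy of a rolling hoop with
    the CM-plus-rotation form (10.17).

    The hoop (mass M, radius a) is modeled as n equal point masses and
    rolls without slipping along +x with CM speed v.
    """
    m = M / n
    omega_z = -v / a                      # clockwise spin for motion in +x
    T_direct = 0.0
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        # Position relative to the CM; velocity is V + omega x r' (in 2D).
        xr, yr = a * math.cos(phi), a * math.sin(phi)
        vx = v - omega_z * yr
        vy = omega_z * xr
        T_direct += 0.5 * m * (vx * vx + vy * vy)
    T_cm = 0.5 * M * v * v                       # translational energy of the CM
    T_rot = 0.5 * (M * a * a) * omega_z ** 2     # hoop: I = M a^2 about the axle
    return T_direct, T_cm + T_rot


T_direct, T_split = rolling_hoop_energies()
```

With these sample values both numbers come out equal to $Mv^2$, half translational and half rotational, as expected for a hoop.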

From (10.14) we can derive an alternative and sometimes useful expression for 
the total kinetic energy. The derivation of (10.14) did not depend on R being the CM 
position, and (10.14) is actually valid for any point R fixed in the body. In particular, 
suppose we choose R to be a point of the body that happens to be at rest (even just 
instantaneously at rest). In this case the first and third terms on the right of (10.14) are 
both zero, and we find that 


$T = \tfrac{1}{2}\sum m_\alpha \dot{\mathbf{r}}'^2_\alpha$. (10.18)

This says that the total kinetic energy of a rigid body is just the rotational energy of 
the body relative to any point of the body that is instantaneously at rest. For example 
the kinetic energy of a rolling wheel can be evaluated as the energy of rotation about 
the point of contact with the road, since this point is instantaneously at rest. 
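
As a consistency check, the two ways of evaluating the energy of a rolling wheel can be compared directly. The short derivation below is a sketch under assumptions not stated in the text: the wheel has mass $M$ and radius $a$, rolls without slipping so that $v = \omega a$, has rotational energy $\tfrac{1}{2}I\omega^2$ about a given axis, and obeys the parallel-axis theorem $I_P = I_{\text{CM}} + Ma^2$ for the axis through the contact point $P$. The symbols $a$, $I_P$, and $I_{\text{CM}}$ are introduced here only for illustration.

```latex
\begin{align*}
T &= \tfrac{1}{2} I_P\,\omega^2
   = \tfrac{1}{2}\left(I_{\text{CM}} + Ma^2\right)\omega^2 \\
  &= \tfrac{1}{2} I_{\text{CM}}\,\omega^2 + \tfrac{1}{2} M(\omega a)^2
   = \tfrac{1}{2} I_{\text{CM}}\,\omega^2 + \tfrac{1}{2} M v^2 .
\end{align*}
```

The final form is the rotational energy about the axle plus the translational energy of the CM, in agreement with (10.17).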

Potential Energy of a Rigid Body

If all the forces on and within an $N$-particle rigid body are conservative, then, as we
saw in Section 4.10, the total potential energy can be written as

$U = U_{\text{ext}} + U_{\text{int}}$ (10.19)

where $U_{\text{ext}}$ is the sum of all potential energies due to any external forces. (For example,
if the body of interest is a baseball, $U_{\text{ext}}$ could be the gravitational energy of all
the particles that comprise the ball in the field of the “external” earth.) The internal
potential energy $U_{\text{int}}$ is the sum of the potential energies for all pairs of particles,

$U_{\text{int}} = \sum_{\alpha<\beta} U_{\alpha\beta}(r_{\alpha\beta})$ (10.20)

where $r_{\alpha\beta}$ is the distance¹ between particles $\alpha$ and $\beta$. However, in a rigid body all
of the interparticle distances are fixed. Therefore, the internal potential energy
is a constant and may as well be ignored. In other words, in discussing the motion
of a rigid body we have to consider only the external forces and their corresponding 
potential energies. 


10.2 Rotation about a Fixed Axis 


The results of the previous section show the importance of rotational motion. For 
example, the kinetic energy of any extended body flying through the air (you might 
think of a drum major’s twirling baton) is the sum of two terms: the translational energy 
of the CM and the rotational energy of its spinning about the CM. The former we 
understand rather completely, but the latter we need to study. In most of the remainder 
of this chapter, we shall be focussing on rotational motion. 

In this section, we’ll start with the special case of a body that is rotating about a 
fixed axis, such as the piece of wood shown in Figure 10.2 spinning on a fixed rod, 
and first calculate its angular momentum. 

Because the axis of rotation is fixed, we can agree to call it the z axis, with the 
origin O somewhere on the axis of rotation. As usual we imagine the body divided 
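into many small pieces with masses $m_\alpha$ ($\alpha = 1, \ldots, N$), and the angular momentum 
is given by the usual formula 

$$\mathbf{L} = \sum_\alpha \boldsymbol{\ell}_\alpha = \sum_\alpha \mathbf{r}_\alpha \times m_\alpha \mathbf{v}_\alpha \tag{10.21}$$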


1 Throughout this chapter I shall assume that all the internal forces are central. This guarantees 
that $U_{\alpha\beta}$ depends only on the magnitude of $\mathbf{r}_{\alpha\beta}$, not on its direction. It also ensures that the internal 
forces never contribute to changes in the total angular momentum. 

Figure 10.2 An egg-shaped block of wood with a hole drilled 
through it is threaded on a rod fixed on the z axis. The block is 
spinning with angular velocity $\boldsymbol{\omega}$. 


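where the velocities $\mathbf{v}_\alpha$ are the velocities with which the pieces of the body are being 
carried in circles by the body’s rotational velocity $\boldsymbol{\omega}$. We saw in (9.22) that these are 
just $\mathbf{v}_\alpha = \boldsymbol{\omega} \times \mathbf{r}_\alpha$. With our z axis along $\boldsymbol{\omega}$, the components of $\boldsymbol{\omega}$ are 

$$\boldsymbol{\omega} = (0, 0, \omega)$$

and 

$$\mathbf{r}_\alpha = (x_\alpha, y_\alpha, z_\alpha).$$

Thus $\mathbf{v}_\alpha = \boldsymbol{\omega} \times \mathbf{r}_\alpha$ has components 

$$\mathbf{v}_\alpha = \boldsymbol{\omega} \times \mathbf{r}_\alpha = (-\omega y_\alpha,\ \omega x_\alpha,\ 0)$$

and finally 

$$\boldsymbol{\ell}_\alpha = m_\alpha \mathbf{r}_\alpha \times \mathbf{v}_\alpha = m_\alpha \omega\,(-z_\alpha x_\alpha,\ -z_\alpha y_\alpha,\ x_\alpha^2 + y_\alpha^2). \tag{10.22}$$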

At last we are ready to calculate the total angular momentum of our spinning solid, 
and I shall start with the z component. If we put the z component of (10.22) into 
(10.21), we find that 


$$L_z = \sum_\alpha m_\alpha (x_\alpha^2 + y_\alpha^2)\,\omega. \tag{10.23}$$

Now, the quantity $(x_\alpha^2 + y_\alpha^2)$ is the same as $\rho_\alpha^2$, where, as usual, $\rho = \sqrt{x^2 + y^2}$ denotes 
the distance of any point (x, y, z) from the z axis. Therefore, 

$$L_z = \sum_\alpha m_\alpha \rho_\alpha^2\,\omega = I_z\,\omega \tag{10.24}$$

where 

$$I_z = \sum_\alpha m_\alpha \rho_\alpha^2 \tag{10.25}$$

is the familiar moment of inertia about the z axis, as defined in every introductory 
physics course: the sum of all the constituent masses, each multiplied by the square 
of its distance from the z axis.² Thus we have proved the familiar result that 

(angular momentum) = (moment of inertia) × (angular velocity). 

Note, however, that the angular momentum on the left of (10.24) is actually $L_z$, and 
the moment of inertia is, of course, that for rotation about the z axis. 

To reinforce this gratifying result, let us calculate the kinetic energy of our rotating 
body. This is 


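$$T = \tfrac{1}{2} \sum_\alpha m_\alpha v_\alpha^2$$

or, since the speed of $m_\alpha$ as it is carried in a circle around the z axis with angular 
velocity $\omega$ is $v_\alpha = \rho_\alpha \omega$, 

$$T = \tfrac{1}{2} \sum_\alpha m_\alpha \rho_\alpha^2\,\omega^2 = \tfrac{1}{2} I_z\,\omega^2, \tag{10.26}$$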

another familiar result from introductory physics. 

So far we have met no surprises, but when we calculate the x and y components 
of $\mathbf{L}$ we find something unexpected. Substituting the x and y components of (10.22) 
into (10.21), we find for the x and y components of $\mathbf{L}$: 

$$L_x = -\sum_\alpha m_\alpha x_\alpha z_\alpha\,\omega \quad\text{and}\quad L_y = -\sum_\alpha m_\alpha y_\alpha z_\alpha\,\omega. \tag{10.27}$$

As we shall see in a moment, the sums here are in general not zero, and we have 
the following surprising conclusion: The angular velocity $\boldsymbol{\omega}$ points in the z direction 
(the body rotates about the z axis), but, since $L_x$ and $L_y$ can be nonzero, the angular 
momentum $\mathbf{L}$ may be in a different direction. That is, the angular momentum may not 
be in the same direction as the angular velocity, and the relation $\mathbf{L} = I\boldsymbol{\omega}$ that you may 
have learned in introductory physics is generally not true! 

To better understand this rather unexpected conclusion, consider a rigid body that 
consists of a single mass m on the end of a massless rod, pivoted about the z axis at 
a fixed angle $\alpha$, as in Figure 10.3. As this body rotates about the z axis, it is easy to 
see that the mass m has velocity $\mathbf{v}$ into the page (negative x direction) and hence that 
$\mathbf{L} = \mathbf{r} \times m\mathbf{v}$ is in the direction shown, at an angle $(90° - \alpha)$ with the z axis. Clearly 
$L_y$ is not equal to zero, and, even though the body is rotating about the z axis, the 
angular momentum is not in that direction. In other words, $\mathbf{L}$ is not parallel to $\boldsymbol{\omega}$. 
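The single-mass example above can be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper names `cross` and `angular_momentum_point_mass`, and the values chosen for the mass, rod length, angle, and angular speed, are illustrative assumptions. It evaluates $\mathbf{L} = \mathbf{r} \times m(\boldsymbol{\omega} \times \mathbf{r})$ for a mass in the yz plane and reports the angle between $\mathbf{L}$ and the z axis.

```python
import math

def cross(a, b):
    """Cross product of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum_point_mass(m, r, omega):
    """L = r x (m v), with v = omega x r, for a single point mass."""
    v = cross(omega, r)
    return tuple(m * c for c in cross(r, v))

# Illustrative values (assumptions, not from the text):
# mass 2, rod length 1, rod angle alpha = 30 degrees, angular speed 3 about z.
m, length, alpha, w = 2.0, 1.0, math.radians(30.0), 3.0
r = (0.0, length * math.sin(alpha), length * math.cos(alpha))  # mass in the yz plane
omega = (0.0, 0.0, w)

L = angular_momentum_point_mass(m, r, omega)

# Angle between L and the z axis, in degrees
angle_L_z = math.degrees(math.atan2(math.hypot(L[0], L[1]), L[2]))
print("L =", L)
print("angle between L and z axis (degrees) =", angle_L_z)
```

With these values the angle should come out as $90° - \alpha$, and $L_y$ should be negative, in line with the geometry described for Figure 10.3.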

This example is worth pursuing a little further. It is clear from the picture that as 
the body rotates steadily about the z axis, the direction of $\mathbf{L}$ changes. (Specifically, $\mathbf{L}$ 
itself sweeps around the z axis.) Therefore, $\dot{\mathbf{L}} \neq 0$, and a torque is required simply to 
keep the body rotating steadily. This conclusion, at first rather surprising, is actually 
easy to understand: The required torque is in the direction of $\dot{\mathbf{L}}$, which is out of the 
page (positive x direction) in Figure 10.3; that is, the torque must be counterclockwise. 
The easiest way to understand this is to put yourself in a frame that is rotating with 


2 You may recall that, when actually calculating moments of inertia, we often replace the sum 
in (10.25) by an integral. For now, however, I shall continue to write moments of inertia as sums. 

Figure 10.3 A rigid rotating body comprising a single mass m 
anchored to the z axis by a massless rod at a fixed angle $\alpha$, 
shown at a moment when m happens to lie in the yz plane. As 
the body rotates about the z axis, m has velocity, and hence 
momentum, into the page (in the negative x direction) at the 
moment shown. Therefore the angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ 
is directed as shown and is certainly not parallel to the angular 
velocity $\boldsymbol{\omega}$. 


the body. In this frame, the mass m experiences a centrifugal force out from the z 
axis (to the right in the picture). Therefore a counterclockwise torque is required to 
prevent the rod that holds m from bending or breaking away from its anchor on the 
axis. 

When a body, such as the wheel of your car coasting along the highway, is rotating 
steadily about a fixed direction, we usually do not want to have to exert any torque 
on it. This means the body must be designed so that its angular momentum is parallel 
to its angular velocity. In the case of your car, this is guaranteed by the process of 
dynamical balancing of the wheels. If the wheels are not properly balanced, you are 
made quickly aware of it by a disagreeable vibration of the car. More generally, the 
question whether or not $\mathbf{L}$ and $\boldsymbol{\omega}$ are parallel is an important issue throughout the study 
of rotating bodies and leads us to the important concept of principal axes, as we shall 
discuss in Section 10.4. 


The Products of Inertia 

We need to collect together our results for the angular momentum of a body rotating 
about the z axis and to streamline our notation. It is clear from (10.27) that $L_x$ and $L_y$ 
are proportional to $\omega$, and the constants of proportionality are generally denoted by 
$I_{xz}$ and $I_{yz}$. Thus I shall rewrite (10.27) as 

$$L_x = I_{xz}\,\omega \quad\text{and}\quad L_y = I_{yz}\,\omega \tag{10.28}$$

where 

$$I_{xz} = -\sum_\alpha m_\alpha x_\alpha z_\alpha \quad\text{and}\quad I_{yz} = -\sum_\alpha m_\alpha y_\alpha z_\alpha. \tag{10.29}$$

The two coefficients $I_{xz}$ and $I_{yz}$ are called the products of inertia of the body. The 
rationale for this new notation is that $I_{xz}$ tells us the x component of $\mathbf{L}$ when $\boldsymbol{\omega}$ is in the 
z direction (and likewise for $I_{yz}$). To conform with this notation, we have to rename 
$I_z$, the old-fashioned moment of inertia about the z axis, and call it $I_{zz}$, 

$$I_{zz} = \sum_\alpha m_\alpha \rho_\alpha^2 = \sum_\alpha m_\alpha (x_\alpha^2 + y_\alpha^2). \tag{10.30}$$

With this notation, we can say that, for a body rotating about the z axis, the angular 
momentum is 

$$\mathbf{L} = (I_{xz}\,\omega,\ I_{yz}\,\omega,\ I_{zz}\,\omega). \tag{10.31}$$

Obviously, it is important to be able to calculate the coefficients $I_{xz}$, $I_{yz}$, and $I_{zz}$ for 
bodies of different shapes. I shall work out some examples in the remainder of this 
chapter and there are plenty more in the problems. Here are three simple examples 
for a start. 

Example 10.1 Calculating Some Simple Moments 
and Products of Inertia 

Calculate the moment and products of inertia for rotation about the z axis of the 
following rigid bodies: (a) A single mass m located at the position $(0, y_0, z_0)$ as 
shown in Figure 10.3. (b) The same as in part (a) but with a second equal mass 
placed symmetrically below the xy plane, as in Figure 10.4(a). (c) A uniform 
ring of mass M and radius $\rho_0$ centered on the z axis and parallel to the xy plane, 
as in Figure 10.4(b). 

Figure 10.4 (a) A rigid body comprising two equal masses m held at 
equal distances above and below the xy plane and rotating about the z 
axis (shown at an instant when the two masses lie in the yz plane). (b) 
A uniform continuous ring of total mass M and radius $\rho_0$, centered on 
the z axis and parallel to the xy plane. (Example 10.1.) 

(a) For the single mass of Figure 10.3, the sums in (10.29) and (10.30) each 
reduce to a single term and we find 

$$I_{xz} = 0, \quad I_{yz} = -m y_0 z_0, \quad\text{and}\quad I_{zz} = m y_0^2.$$

That $I_{yz}$ is nonzero confirms that $\mathbf{L}$ has a nonzero y component and hence that 
the angular momentum $\mathbf{L}$ is not in the same direction as the axis of rotation. 
The remaining parts (b) and (c) illustrate how, when a rigid body has certain 
symmetries, the products of inertia can turn out to be zero. 

(b) For the two masses of Figure 10.4(a) each of the sums in (10.29) and 
(10.30) contains two terms and we find 

$$I_{xz} = -\sum_\alpha m_\alpha x_\alpha z_\alpha = 0 \tag{10.32}$$

because both masses have $x_\alpha = 0$, 

$$I_{yz} = -\sum_\alpha m_\alpha y_\alpha z_\alpha = -m\,[y_0 z_0 + y_0(-z_0)] = 0, \tag{10.33}$$

and 

$$I_{zz} = \sum_\alpha m_\alpha (x_\alpha^2 + y_\alpha^2) = m\,(0 + y_0^2 + 0 + y_0^2) = 2 m y_0^2.$$

The interesting case here is the product of inertia $I_{yz}$, which is zero because the 
contribution of the first mass is exactly cancelled by that of the second mass, at 
the “mirror image” point with the opposite sign of $z_\alpha$. Clearly this will happen for 
any body that has reflection symmetry in the plane z = 0;³ both of the products 
of inertia $I_{xz}$ and $I_{yz}$ will be zero because every term in each of the sums of 
(10.32) and (10.33) will be cancelled by another term with the opposite sign of 
$z_\alpha$. With $I_{xz} = I_{yz} = 0$, we see from (10.31) that when the body rotates about 
the z axis, the angular momentum $\mathbf{L}$ is also along the z axis. 

(c) Because the body in Figure 10.4(b) is a continuously distributed mass, we 
should in general evaluate the products and moment of inertia as integrals, but in 
this case we can see the answers without actually doing any integrals. Consider, 
first, the product of inertia $I_{xz}$ as given by the sum in (10.32). Referring to Figure 
10.4(b), it is easy to see that this sum is zero: Each contribution to the sum from 
a small mass $m_\alpha$ at $(x_\alpha, y_\alpha, z_\alpha)$ can be paired with the contribution from an equal 
mass diametrically across the circle at $(-x_\alpha, -y_\alpha, z_\alpha)$. This second contribution, 
with the same value of z but the opposite value for x, exactly cancels the first, 
and we conclude that the whole sum $I_{xz}$ is zero. By exactly the same argument, 
$I_{yz} = 0$. The moment $I_{zz}$ is most easily evaluated in the form of the first sum in 
(10.30). Since all terms in the sum have the same value of $\rho_\alpha$ (namely $\rho_0$), we 
can factor $\rho_0^2$ out of the sum and are left with 

$$I_{zz} = \rho_0^2 \sum_\alpha m_\alpha = M \rho_0^2.$$


3 We say that a body has reflection symmetry in the plane z = 0 if the mass density is the same 
at any point (x, y, z) as at the point (x, y, −z), the reflection of (x, y, z) in a mirror located in the 
plane z = 0. 

In this example, the two products of inertia were zero because the body 
was axially symmetric about its axis of rotation,⁴ and it is easy to see that the 
same result holds quite generally: If a rigid body is axially symmetric about a 
certain axis (like a well-balanced wheel about its axle or a circular cone about 
its center line), then the two products of inertia for rotation about that axis are 
automatically zero, since the terms of the sums in (10.32) and (10.33) cancel 
in pairs. In particular, if a body is axially symmetric about a certain axis and is 
rotating about this symmetry axis, its angular momentum will be in the same 
direction. 
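The three parts of Example 10.1 can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `inertia_about_z` and all numerical values are hypothetical, and the continuous ring of part (c) is approximated by a large even number of equal point masses, so its products of inertia vanish only to rounding error.

```python
import math

def inertia_about_z(masses, positions):
    """Return (I_xz, I_yz, I_zz) for point masses, following (10.29) and (10.30)."""
    I_xz = -sum(m * x * z for m, (x, y, z) in zip(masses, positions))
    I_yz = -sum(m * y * z for m, (x, y, z) in zip(masses, positions))
    I_zz = sum(m * (x * x + y * y) for m, (x, y, z) in zip(masses, positions))
    return I_xz, I_yz, I_zz

# Illustrative values (assumptions, not from the text)
m, y0, z0 = 1.5, 2.0, 0.5

# (a) single mass at (0, y0, z0)
part_a = inertia_about_z([m], [(0.0, y0, z0)])

# (b) two equal masses at (0, y0, z0) and (0, y0, -z0)
part_b = inertia_about_z([m, m], [(0.0, y0, z0), (0.0, y0, -z0)])

# (c) uniform ring of total mass M and radius rho0 at height z_ring,
#     approximated by n equal point masses (n even, so masses pair up across the circle)
M, rho0, z_ring, n = 3.0, 2.0, 0.7, 360
ring = [(rho0 * math.cos(2 * math.pi * k / n),
         rho0 * math.sin(2 * math.pi * k / n),
         z_ring) for k in range(n)]
part_c = inertia_about_z([M / n] * n, ring)

print("(a)", part_a)   # expect I_yz = -m*y0*z0 and I_zz = m*y0**2
print("(b)", part_b)   # expect both products zero and I_zz = 2*m*y0**2
print("(c)", part_c)   # expect both products ~0 and I_zz ~ M*rho0**2
```

The discretized ring is only a stand-in for the continuous body; the symmetry argument in the text gives the same result without any summation.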


10.3 Rotation about Any Axis; the Inertia Tensor 


So far we have considered only a body that is rotating about the z axis. In a certain 
sense this is quite general: Whatever axis a body is rotating about, we can choose to 
call it the z axis. Unfortunately, although this statement is true, it does not tell the 
whole story. First, we are often interested in bodies that are free to rotate about any 
axis — a gyroscope’s bearings allow it to rotate about any axis, and a projectile (such 
as a baseball or a drum major’s baton thrown in the air) has the same freedom. When 
this is the case, the axis about which the body rotates may change with time. If this 
happens, then we can certainly choose as our z axis the axis of rotation at one instant, 
but a moment later our chosen z axis is almost certainly not the axis of rotation. For 
this reason alone, we must clearly examine the form of the angular momentum when 
a body is spinning about an arbitrary axis. 

The second reason that we need to consider rotation about an arbitrary axis is 
subtler, and I shall return to it later, but let me mention it briefly here. We have seen 
that in general the direction of the angular momentum of a spinning body is not the 
same as the axis of rotation. On the other hand, it sometimes happens that these two 
directions are the same. (For instance, we have seen that this is the case for an axially 
symmetric body rotating about its axis of symmetry.) When this is true, we say that 
the axis in question is a principal axis. We shall find for any given body, rotating about 
any given point, that there are three mutually perpendicular principal axes. Because 
much of our discussion of rotations is far simpler when referred to these principal 
axes, we often wish to choose the principal axes as our coordinate axes. If we do this, 
then we are no longer at liberty to choose our z axis to coincide with an arbitrary axis 
of rotation. Again, we must learn to allow any axis to be the axis of rotation, and our 
first order of business is to calculate the angular momentum corresponding to such 
rotation. 


4 We say that a body is axially, or rotationally, symmetric about an axis if the mass density is the 
same at all points on any circle centered on, and perpendicular to, the axis. In terms of cylindrical 
polar coordinates $(\rho, \phi, z)$ centered on the axis in question, the mass density is independent of $\phi$. 
Alternatively, the mass distribution is unchanged by any rotation about the axis of symmetry. 

Angular Momentum for an Arbitrary Angular Velocity 

Let us then consider a rigid body rotating about an arbitrary axis with angular velocity 

$$\boldsymbol{\omega} = (\omega_x, \omega_y, \omega_z).$$

Before we launch into the calculation of the angular momentum, let us pause to 
consider the kind of situation to which our calculations will apply. There are in fact 
two important cases that you could keep in mind: First, it sometimes happens that a 
rigid body has one fixed point, so that its only possible motion is a rotation about that 
fixed point. For example, the Foucault pendulum is fixed at its point of support in the 
ceiling, and its only possible motion is rotation about that fixed point. Again, a top 
spinning on a table can get its tip caught in a small dent in the table, and, from then 
on, it can only rotate about the fixed position of its tip. In either case, the magnitude 
and direction of the angular velocity $\boldsymbol{\omega}$ can change, but the rotation will always be 
about the fixed point, which we shall naturally take to be our origin. 

The second case that you could bear in mind is that of an object that has been 
thrown in the air. In this case there is certainly not any fixed point, but we have seen 
that we can analyze the motion in terms of the motion of the CM and the rotational 
motion relative to the CM. In this case, the motion that we are now analyzing is the 
motion relative to the CM, which we shall naturally take to be the origin. 

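With these examples in mind, let us calculate the body’s angular momentum, 

$$\mathbf{L} = \sum_\alpha m_\alpha \mathbf{r}_\alpha \times \mathbf{v}_\alpha = \sum_\alpha m_\alpha \mathbf{r}_\alpha \times (\boldsymbol{\omega} \times \mathbf{r}_\alpha). \tag{10.34}$$

We can evaluate this almost exactly as we did in the previous section for the case that 
$\boldsymbol{\omega}$ was along the z axis: For any position $\mathbf{r}$ we can write down the components of $\boldsymbol{\omega} \times \mathbf{r}$ 
and then $\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})$, with the rather ugly result (there are several ways to do this, 
one of which is the so-called BAC − CAB rule; see Problem 10.19) 

$$\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r}) = \big( (y^2 + z^2)\,\omega_x - xy\,\omega_y - xz\,\omega_z,\ \ -yx\,\omega_x + (z^2 + x^2)\,\omega_y - yz\,\omega_z,\ \ -zx\,\omega_x - zy\,\omega_y + (x^2 + y^2)\,\omega_z \big). \tag{10.35}$$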
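For readers who want to see where (10.35) comes from, here is a short worked derivation using the BAC − CAB rule mentioned in the text. It is offered as a sketch of one route, not necessarily the one intended in Problem 10.19; the only identity assumed is $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B})$.

```latex
\begin{aligned}
\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})
  &= \boldsymbol{\omega}\,(\mathbf{r}\cdot\mathbf{r}) - \mathbf{r}\,(\mathbf{r}\cdot\boldsymbol{\omega}) \\
  &= (x^2 + y^2 + z^2)\,\boldsymbol{\omega} - (x\omega_x + y\omega_y + z\omega_z)\,\mathbf{r}. \\[4pt]
\text{x component:}\quad
  & (x^2 + y^2 + z^2)\,\omega_x - x\,(x\omega_x + y\omega_y + z\omega_z) \\
  &= (y^2 + z^2)\,\omega_x - xy\,\omega_y - xz\,\omega_z .
\end{aligned}
```

The $x^2\omega_x$ terms cancel, leaving the first component of (10.35); the y and z components follow in the same way by cyclic relabeling of x, y, and z.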

Substituting (10.35) into (10.34), we can write down the three components of $\mathbf{L}$ as 
follows: 

$$\begin{aligned}
L_x &= I_{xx}\,\omega_x + I_{xy}\,\omega_y + I_{xz}\,\omega_z \\
L_y &= I_{yx}\,\omega_x + I_{yy}\,\omega_y + I_{yz}\,\omega_z \\
L_z &= I_{zx}\,\omega_x + I_{zy}\,\omega_y + I_{zz}\,\omega_z
\end{aligned} \tag{10.36}$$

where the three moments of inertia, $I_{xx}$, $I_{yy}$, $I_{zz}$, and the six products of inertia, 
$I_{xy}, \ldots$, are defined in exact parallel with the definitions (10.29) and (10.30) of the 
previous section. For example, 

$$I_{xx} = \sum_\alpha m_\alpha (y_\alpha^2 + z_\alpha^2) \tag{10.37}$$

and similarly for $I_{yy}$ and $I_{zz}$, and 

$$I_{xy} = -\sum_\alpha m_\alpha x_\alpha y_\alpha \tag{10.38}$$

and so on. 

The rather clumsy result (10.36) can be streamlined in a couple of ways: If you 
don’t mind replacing the subscripts x, y, and z with i = 1, 2, 3, then (10.36) takes the 
compact form 

$$L_i = \sum_{j=1}^{3} I_{ij}\,\omega_j. \tag{10.39}$$


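This suggests another way to think about (10.36) since, as you may recognize, (10.39) 
is the rule for matrix multiplication. Thus, (10.36) can be rewritten in matrix form. 
First, we introduce the 3 × 3 matrix 

$$\mathbf{I} = \begin{bmatrix} I_{xx} & I_{xy} & I_{xz} \\ I_{yx} & I_{yy} & I_{yz} \\ I_{zx} & I_{zy} & I_{zz} \end{bmatrix} \tag{10.40}$$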


which is called the moment of inertia tensor⁵ or just inertia tensor. In addition, let 
us agree temporarily to think of three-dimensional vectors as 3 × 1 columns made up 
of their three components; that is, we write 

$$\mathbf{L} = \begin{bmatrix} L_x \\ L_y \\ L_z \end{bmatrix} \quad\text{and}\quad \boldsymbol{\omega} = \begin{bmatrix} \omega_x \\ \omega_y \\ \omega_z \end{bmatrix}. \tag{10.41}$$

(Notice that I am now using boldface for two kinds of matrices: square 3 × 3 
matrices like $\mathbf{I}$, and 3 × 1 column matrices like $\mathbf{L}$ and $\boldsymbol{\omega}$ that are really just vectors. 
You will quickly learn to distinguish the two kinds of matrix from the context.) With 
these notations, Equation (10.36) takes the very compact matrix form 

$$\mathbf{L} = \mathbf{I}\,\boldsymbol{\omega} \tag{10.42}$$


5 The full definition of a tensor involves the transformation properties of its elements when we 
rotate our coordinate axes (see Section 15.17), but for our present purposes it is sufficient to say that 
a three-dimensional tensor is a set of nine numbers arranged as a 3 × 3 matrix as in (10.40). 

where the product on the right is the standard product of two matrices, the first a 3 × 3 
square matrix and the second a 3 × 1 column. 
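To make the matrix form concrete, the following Python sketch builds the inertia tensor for a handful of point masses and checks that the matrix product $\mathbf{I}\boldsymbol{\omega}$ agrees with the direct sum (10.34). Everything here is illustrative: the function names, masses, positions, and angular velocity are assumptions, and the compact expression $m(r^2\delta_{ij} - r_i r_j)$ used in the loop is simply a restatement of the definitions (10.37) and (10.38) in index form.

```python
def cross(a, b):
    """Cross product of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_tensor(masses, positions):
    """3x3 inertia tensor (nested lists) for point masses.

    Diagonal elements follow (10.37), e.g. I_xx = sum m (y^2 + z^2);
    off-diagonal elements follow (10.38), e.g. I_xy = -sum m x y.
    """
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                I[i][j] += m * (r2 * delta - r[i] * r[j])
    return I

def mat_vec(I, w):
    """Matrix times column vector, i.e. L_i = sum_j I_ij w_j as in (10.39)."""
    return tuple(sum(I[i][j] * w[j] for j in range(3)) for i in range(3))

def angular_momentum_direct(masses, positions, omega):
    """Direct evaluation of L = sum m r x (omega x r), as in (10.34)."""
    L = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        term = cross(r, cross(omega, r))
        for i in range(3):
            L[i] += m * term[i]
    return tuple(L)

# Illustrative data (assumptions, not from the text)
masses = [1.0, 2.0, 0.5]
positions = [(1.0, 0.0, 2.0), (-0.5, 1.5, 0.3), (0.2, -1.0, -0.7)]
omega = (0.3, -1.2, 2.0)

I = inertia_tensor(masses, positions)
L_matrix = mat_vec(I, omega)
L_direct = angular_momentum_direct(masses, positions, omega)
print("L from I*omega :", L_matrix)
print("L from (10.34) :", L_direct)
```

The two printed vectors should agree to rounding error, whatever point masses are chosen.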

This beautiful result is our first example of the great usefulness of matrix algebra in 
mechanics. In many areas of physics (perhaps most especially in quantum mechanics, 
but also certainly in classical mechanics) the formulation of many problems in 
matrix notation is so much simpler than in any other way that it is absolutely essential 
that you become familiar with the basics of matrix algebra.⁶ 

An important property of the moment of inertia tensor (10.40) is that it is a 
symmetric matrix; that is, its elements satisfy 

$$I_{ij} = I_{ji}. \tag{10.43}$$

Another way to say this is that the matrix (10.40) is unchanged if we reflect it in the 
main diagonal, the diagonal running from top left to bottom right. Each element 
above the diagonal (for instance $I_{xy}$) is equal to its mirror image ($I_{yx}$) below the 
diagonal. To prove this property, you have only to look at the definition (10.38) to 
see that $I_{xy}$ is the same as $I_{yx}$ and similarly with all the off-diagonal elements $I_{ij}$ (the 
elements with $i \neq j$). Yet another way to state this property is to define the transpose 
of any matrix $\mathbf{A}$ as the matrix $\tilde{\mathbf{A}}$ obtained by reflecting $\mathbf{A}$ in its main diagonal: the 
ij element of $\tilde{\mathbf{A}}$ is the ji element of $\mathbf{A}$. Thus the result (10.43) means that the matrix 
$\mathbf{I}$ is equal to its own transpose, 

$$\mathbf{I} = \tilde{\mathbf{I}}. \tag{10.44}$$

This property — that the matrix I is symmetric — plays a key role in the mathematical 
theory of the moment of inertia tensor. 


Example 10.2 Inertia Tensor for a Solid Cube 

Find the moment of inertia tensor for (a) a uniform solid cube, of mass M and 
side a, rotating about a corner (Figure 10.5) and (b) the same cube rotating about 
its center. Use axes parallel to the cube’s edges. For both cases, find the angular 
momentum when the axis of rotation is parallel to x [that is, $\boldsymbol{\omega} = (\omega, 0, 0)$] and 
also when $\boldsymbol{\omega}$ is along the body diagonal in the direction (1, 1, 1). 

(a) Because the mass is continuously distributed we need to replace the sums 
in the definitions (10.37) and (10.38) by integrals. Thus (10.37) becomes 

$$I_{xx} = \int_0^a dx \int_0^a dy \int_0^a dz\ \varrho\,(y^2 + z^2) \tag{10.45}$$

where $\varrho = M/a^3$ denotes the mass density of the cube. This is the sum of two 
triple integrals, each of which can be factored into three single integrals. For 
example, 


6 The matrix operations that I shall assume you already know (matrix addition, multiplication, 
transposition, determinants, and a few more) can be found in Chapter 3 of Mary Boas, Mathematical 
Methods in the Physical Sciences (Wiley, 1983). Some of the ideas I shall be developing in this 
and the next chapter are discussed in more detail in Chapter 10 of the same book. 

Figure 10.5 A uniform solid cube of side a that is 
free to rotate about the corner O. 

$$\int_0^a dx \int_0^a dy \int_0^a dz\ \varrho\,y^2 = \varrho \left( \int_0^a dx \right) \left( \int_0^a y^2\,dy \right) \left( \int_0^a dz \right) = \tfrac{1}{3}\varrho a^5 = \tfrac{1}{3} M a^2. \tag{10.46}$$

The second term in (10.45) has the same value, and we conclude that 

$$I_{xx} = \tfrac{2}{3} M a^2 \tag{10.47}$$

with (by symmetry) the same values for $I_{yy}$ and $I_{zz}$. 

The integral form of (10.38) for the off-diagonal elements of $\mathbf{I}$ is 

$$I_{xy} = -\int_0^a dx \int_0^a dy \int_0^a dz\ \varrho\,xy = -\varrho \left( \int_0^a x\,dx \right) \left( \int_0^a y\,dy \right) \left( \int_0^a dz \right) = -\tfrac{1}{4}\varrho a^5 = -\tfrac{1}{4} M a^2 \tag{10.48}$$

with (again by symmetry) the same answer for all the other off-diagonal elements. 

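Putting all of these results together, we find for the moment of inertia tensor 
of a cube rotating about its corner 

$$\mathbf{I} = \begin{bmatrix} \tfrac{2}{3} M a^2 & -\tfrac{1}{4} M a^2 & -\tfrac{1}{4} M a^2 \\ -\tfrac{1}{4} M a^2 & \tfrac{2}{3} M a^2 & -\tfrac{1}{4} M a^2 \\ -\tfrac{1}{4} M a^2 & -\tfrac{1}{4} M a^2 & \tfrac{2}{3} M a^2 \end{bmatrix} = \frac{M a^2}{12} \begin{bmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{bmatrix} \qquad [\text{about corner}] \tag{10.49}$$

where the second, more compact, form follows from the rules for multiplying a 
matrix by a number. (Notice that, as expected, $\mathbf{I}$ is a symmetric matrix.) 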

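According to (10.42) the angular momentum $\mathbf{L}$ corresponding to an angular 
velocity $\boldsymbol{\omega}$ is given by the matrix product $\mathbf{L} = \mathbf{I}\boldsymbol{\omega}$, where the vectors $\mathbf{L}$ and $\boldsymbol{\omega}$ 
are understood as 3 × 1 columns made up of the three components of the vector 
concerned. Thus if the cube is rotating about the x axis, 

$$\mathbf{L} = \mathbf{I}\boldsymbol{\omega} = \frac{M a^2}{12} \begin{bmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{bmatrix} \begin{bmatrix} \omega \\ 0 \\ 0 \end{bmatrix} = \frac{M a^2}{12} \begin{bmatrix} 8\omega \\ -3\omega \\ -3\omega \end{bmatrix} \tag{10.50}$$

or, reverting to more standard vector notation, 

$$\mathbf{L} = M a^2 \omega \left( \tfrac{2}{3},\ -\tfrac{1}{4},\ -\tfrac{1}{4} \right). \tag{10.51}$$

As we have come to recognize is possible, we see that in this case $\mathbf{L}$ is not in 
the same direction as the angular velocity $\boldsymbol{\omega} = (\omega, 0, 0)$. 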

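If the cube is rotating about its main diagonal, then the unit vector in the 
direction of rotation is $\mathbf{u} = (1/\sqrt{3})(1, 1, 1)$ and the angular velocity vector is 
$\boldsymbol{\omega} = \omega\mathbf{u} = (\omega/\sqrt{3})(1, 1, 1)$. Thus according to (10.42), the angular momentum 
for this case is 

$$\mathbf{L} = \mathbf{I}\boldsymbol{\omega} = \frac{M a^2}{12} \frac{\omega}{\sqrt{3}} \begin{bmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \frac{M a^2}{6} \frac{\omega}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \frac{M a^2}{6}\,\boldsymbol{\omega}. \tag{10.52}$$

In this case, rotation about the main diagonal of the cube, we see that the angular 
momentum is in the same direction as the angular velocity. 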
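The two matrix products in part (a) are easy to verify with exact rational arithmetic. The sketch below is an illustration only: it works in units of $Ma^2$, takes the tensor directly from (10.49), and the names `I_corner` and `mat_vec` are hypothetical. The angular velocity is represented by an unnormalized direction, since the overall factor $\omega$ (or $\omega/\sqrt{3}$) multiplies every component equally.

```python
from fractions import Fraction

# Inertia tensor about a corner, in units of M a^2, taken from (10.49)
I_corner = [[Fraction(8, 12), Fraction(-3, 12), Fraction(-3, 12)],
            [Fraction(-3, 12), Fraction(8, 12), Fraction(-3, 12)],
            [Fraction(-3, 12), Fraction(-3, 12), Fraction(8, 12)]]

def mat_vec(I, w):
    """Matrix times column vector: L_i = sum_j I_ij w_j."""
    return [sum(I[i][j] * w[j] for j in range(3)) for i in range(3)]

# Rotation about the x axis: omega proportional to (1, 0, 0)
L_x_axis = mat_vec(I_corner, [1, 0, 0])

# Rotation about the body diagonal: omega proportional to (1, 1, 1)
L_diagonal = mat_vec(I_corner, [1, 1, 1])

print("omega along x       :", [str(c) for c in L_x_axis])
print("omega along (1,1,1) :", [str(c) for c in L_diagonal])
```

For the x axis the result has unequal components, so $\mathbf{L}$ is not parallel to $\boldsymbol{\omega}$; for the diagonal every component is the same multiple of the input, which is the statement $\mathbf{L} = (Ma^2/6)\,\boldsymbol{\omega}$ of (10.52).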

(b) If the cube is rotating about its center, then in Figure 10.5 we must move 
the origin O to the center of the cube. This means that all of the integrals in 
(10.45) and (10.48) run from −a/2 to a/2 instead of 0 to a. Evaluating (10.45) 
for $I_{xx}$ [as in (10.46)] we find 

$$I_{xx} = \tfrac{1}{6} M a^2 \tag{10.53}$$

and likewise $I_{yy}$ and $I_{zz}$. When the limits in (10.48) are replaced by −a/2 and 
a/2, both of the first two integrals are zero, and we conclude that 

$$I_{xy} = 0.$$

As you can easily check, all of the off-diagonal elements of $\mathbf{I}$ work in the same 
way and are zero. We could actually have anticipated this vanishing of the 
off-diagonal elements of $\mathbf{I}$ based on Example 10.1. We saw there that if the plane 
z = 0 is a plane of reflection symmetry, then both $I_{xz}$ and $I_{yz}$ are automatically 
zero. (Every contribution from above the plane z = 0 was canceled by the 
contribution from the corresponding point below the plane.) Therefore, if two 
of the three coordinate planes x = 0, y = 0, and z = 0 are planes of reflection 
symmetry, this guarantees that all of the products of inertia are zero. For the cube 
(with O at its center) all three of the coordinate planes are planes of reflection 
symmetry, so it was inevitable that the off-diagonal elements of $\mathbf{I}$ turned out to 
be zero. 

Collecting results, we conclude that for a cube rotating about its center, the 
moment of inertia tensor is 

$$\mathbf{I} = \tfrac{1}{6} M a^2 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \tfrac{1}{6} M a^2\,\mathbf{1} \qquad [\text{about CM}], \tag{10.54}$$

where $\mathbf{1}$ denotes the 3 × 3 unit matrix, 

$$\mathbf{1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{10.55}$$

We shall see that, because the moment of inertia tensor for rotation about the 
center of a cube is a multiple of the unit matrix, rotations about the center of a 
cube are especially easy to analyze. In particular, the angular momentum of the 
cube (rotating about its center) is 

$$\mathbf{L} = \mathbf{I}\boldsymbol{\omega} = \tfrac{1}{6} M a^2\,\mathbf{1}\,\boldsymbol{\omega} = \tfrac{1}{6} M a^2\,\boldsymbol{\omega}, \tag{10.56}$$

which implies that the angular momentum $\mathbf{L}$ is in the same direction as the angular 
velocity $\boldsymbol{\omega}$, whatever the direction of $\boldsymbol{\omega}$. This simple result is a consequence 
of the high degree of symmetry of the cube relative to its center. 

You will notice that (10.56) for rotation about any axis through the center 
of the cube agrees with (10.52) for rotation about the main diagonal through 
the corner of the cube. This is not an accident: The main diagonal through the 
center of the cube is exactly the same as the main diagonal through the corner. 
Therefore the angular momenta for rotations about these two axes have to agree. 
This example illustrated many of the features of a typical calculation of the moment 
of inertia tensor I, and you should certainly try some of the problems at the end of this 
chapter to get used to doing such calculations yourself. Meanwhile, here is one more 
example to illustrate the special features of a body that is axially symmetric. 



Example 10.3 Inertia Tensor for a Solid Cone 

Find the moment of inertia tensor $\mathbf{I}$ for a spinning top that is a uniform solid 
cone (mass M, height h, and base radius R) spinning about its tip. Choose the 
z axis along the axis of symmetry of the cone, as shown in Figure 10.6. For an 
arbitrary angular velocity $\boldsymbol{\omega}$, what is the top’s angular momentum $\mathbf{L}$? 

The moment of inertia about the z axis, $I_{zz}$, is given by the integral 

$$I_{zz} = \int_V dV\ \varrho\,(x^2 + y^2) \tag{10.57}$$

where the subscript V on the integral indicates that the integral runs over the 
volume of the body, dV is an element of volume, and $\varrho$ is the constant mass 
density $\varrho = M/V = 3M/(\pi R^2 h)$. This integral is easily evaluated in cylindrical 

Figure 10.6 A top comprising a uniform solid cone, of mass M, 
height h, and base radius R, spins about its tip. The radius of the 
cone at height z is r = Rz/h. (The top is not necessarily vertical, 
but, whatever its orientation, we choose the z axis along the axis 
of symmetry.) 

polar coordinates $(\rho, \phi, z)$ since $x^2 + y^2 = \rho^2$. Thus the integral can be written 

$$I_{zz} = \varrho \int_V dV\,\rho^2 = \varrho \int_0^h dz \int_0^{2\pi} d\phi \int_0^r \rho\,d\rho\ \rho^2 \tag{10.58}$$

where the upper limit r of the $\rho$ integral is the radius of the cone at height z, as 
shown in Figure 10.6. These integrals are easily carried out and give (Problem 
10.26) 

$$I_{zz} = \tfrac{3}{10} M R^2. \tag{10.59}$$

Because of the top’s rotational symmetry about the z axis, the other two 
moments of inertia, $I_{xx}$ and $I_{yy}$, are equal. (A rotation through 90° about the 
z axis leaves the body unchanged but interchanges $I_{xx}$ and $I_{yy}$. Therefore, 
$I_{xx} = I_{yy}$.) To evaluate $I_{xx}$ we write 

$$I_{xx} = \int_V dV\ \varrho\,(y^2 + z^2) = \int_V dV\ \varrho\,y^2 + \int_V dV\ \varrho\,z^2. \tag{10.60}$$


7 Here is one of those horrible moments when the traditional use of the Greek letter “rho” for 
density collides with its use for the cylindrical coordinate equal to the distance from the z axis. Notice 
that I am using two different versions of the letter, $\varrho$ for density and $\rho$ for the coordinate. Fortunately, 
this unlucky collision happens only very occasionally. If you need to refresh your memory about 
volume integrals in cylindrical coordinates, see Problem 10.26. 

The first integral here is the same as the second term in (10.57), and, by the 
rotational symmetry, the two terms in (10.57) are equal. Therefore, the first term 
of (10.60) is just $I_{zz}/2$ or $\tfrac{3}{20} M R^2$. The second integral in (10.60) can be evaluated 
using cylindrical polar coordinates, as in (10.58), and gives $\tfrac{3}{5} M h^2$. Therefore 

$$I_{xx} = I_{yy} = \tfrac{3}{20} M (R^2 + 4h^2). \tag{10.61}$$

This leaves the off-diagonal products of inertia, $I_{xy}, \ldots$, to calculate, but 
we can easily see that all of these are zero. The point is that because of the 
axial symmetry about the z axis, both of the planes x = 0 and y = 0 are planes 
of reflection symmetry. (See Figure 10.6.) By the argument given in Example 
10.1, symmetry about the plane x = 0 implies that $I_{xy} = I_{xz} = 0$. Similarly, 
symmetry about y = 0 implies that $I_{yz} = I_{yx} = 0$. Thus, symmetry about any 
two coordinate planes guarantees that all of the products of inertia are zero. 

Collecting results, we find that the moment of inertia tensor for a uniform 
cone (relative to its tip) is 

$$\mathbf{I} = \frac{3}{20} M \begin{bmatrix} R^2 + 4h^2 & 0 & 0 \\ 0 & R^2 + 4h^2 & 0 \\ 0 & 0 & 2R^2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}, \tag{10.62}$$

where the last form is just for convenience of discussion (and in the present case, 
it happens that $\lambda_1 = \lambda_2$). The most striking thing about this matrix is that its 
off-diagonal elements are all zero. A matrix with this property is called a diagonal 
matrix. [The inertia tensor (10.54) for a cube about its center was also diagonal, 
but since all its diagonal elements were equal, it was actually a multiple of the 
unit matrix $\mathbf{1}$, making it an even more special case.] The important consequence 
of $\mathbf{I}$ being diagonal emerges when we evaluate the angular momentum $\mathbf{L}$ for an 
arbitrary angular velocity $\boldsymbol{\omega} = (\omega_x, \omega_y, \omega_z)$: 

$$\mathbf{L} = \mathbf{I}\boldsymbol{\omega} = (\lambda_1 \omega_x,\ \lambda_2 \omega_y,\ \lambda_3 \omega_z). \tag{10.63}$$

While this may not look remarkable, notice that if the angular velocity $\boldsymbol{\omega}$ points 
along one of the coordinate axes, then the same is true of the angular momentum 
$\mathbf{L}$. For example, if $\boldsymbol{\omega}$ points along the x axis, then $\omega_y = \omega_z = 0$ and (10.63) 
implies that 

$$\mathbf{L} = \mathbf{I}\boldsymbol{\omega} = (\lambda_1 \omega_x,\ 0,\ 0) \tag{10.64}$$

and $\mathbf{L}$ also points along the x axis. Obviously, the same thing happens if $\boldsymbol{\omega}$ points 
along the y or z axes, and we see quite generally that if the inertia tensor $\mathbf{I}$ is 
diagonal, then $\mathbf{L}$ will be parallel to $\boldsymbol{\omega}$ whenever $\boldsymbol{\omega}$ points along one of the three 
coordinate axes. 
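The cone integrals of Example 10.3 can be checked numerically by slicing the cone into thin discs. This is a sketch under stated assumptions rather than the calculation of Problem 10.26: the function name `cone_moments` and the test values of M, R, and h are hypothetical, and the integrals over each disc are done analytically (a disc of radius r contributes $\pi r^4/2$ to $\int \rho^2\,dA$ and $\pi r^2$ to $\int dA$), leaving a one-dimensional midpoint sum over z.

```python
import math

def cone_moments(M, R, h, n=2000):
    """Midpoint-rule estimates of (I_xx, I_zz) for a uniform solid cone about its tip.

    The cone is sliced into n discs; the disc at height z has radius r = R z / h.
    """
    density = 3.0 * M / (math.pi * R ** 2 * h)   # M divided by the cone volume
    dz = h / n
    I_zz = 0.0
    z2_term = 0.0
    for k in range(n):
        z = (k + 0.5) * dz
        r = R * z / h
        # integral of rho^2 over the disc is pi r^4 / 2, as in (10.58)
        I_zz += density * (math.pi / 2.0) * r ** 4 * dz
        # integral of z^2 over the disc is z^2 times the disc area
        z2_term += density * z ** 2 * math.pi * r ** 2 * dz
    # first term of (10.60) equals I_zz / 2 by rotational symmetry
    I_xx = 0.5 * I_zz + z2_term
    return I_xx, I_zz

# Illustrative values (assumptions, not from the text)
M, R, h = 2.0, 1.0, 3.0
I_xx, I_zz = cone_moments(M, R, h)

print("I_zz numeric:", I_zz, " formula (10.59):", 0.3 * M * R ** 2)
print("I_xx numeric:", I_xx, " formula (10.61):", 0.15 * M * (R ** 2 + 4 * h ** 2))
```

The numerical values should match (10.59) and (10.61) to several decimal places, with the residual difference coming from the finite number of slices.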
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10.4 Principal Axes of Inertia 


We have seen that in general the angular momentum of a body spinning about a point
O is not in the same direction as the axis of rotation; that is, $\mathbf{L}$ is not parallel to $\boldsymbol{\omega}$. We
have also seen that, for certain bodies at least, there can be certain axes for which $\mathbf{L}$
and $\boldsymbol{\omega}$ are parallel. When this happens, we say that the axis in question is a principal
axis. To express this definition mathematically, note that two nonzero vectors $\mathbf{a}$ and
$\mathbf{b}$ are parallel if and only if $\mathbf{a} = \lambda\mathbf{b}$ for some real number $\lambda$. Thus we can define a
principal axis of a body (about an origin O) as any axis through O with the property
that if $\boldsymbol{\omega}$ points along the axis, then

$$\mathbf{L} = \lambda\boldsymbol{\omega} \tag{10.65}$$

for some real number $\lambda$. To see the significance of the number $\lambda$ in this equation, we
can temporarily choose the direction of $\boldsymbol{\omega}$ to be the z direction, in which case $\mathbf{L}$ is
given by (10.31) as $\mathbf{L} = (I_{xz}\omega, I_{yz}\omega, I_{zz}\omega)$. Since $\mathbf{L}$ is parallel to $\boldsymbol{\omega}$, the first two
components are zero, and we conclude that $\mathbf{L} = (0, 0, I_{zz}\omega) = I_{zz}\boldsymbol{\omega}$. Comparing with
(10.65), we see that the number $\lambda$ in (10.65) is just the body’s moment of inertia about
the axis in question. To summarize, if the angular velocity $\boldsymbol{\omega}$ points along a principal
axis, then $\mathbf{L} = \lambda\boldsymbol{\omega}$, where $\lambda$ is the moment of inertia about the axis in question.
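The definition (10.65) can be tested numerically: a direction is a principal axis exactly when $\mathbf{L} = \mathbf{I}\boldsymbol{\omega}$ has zero vector product with $\boldsymbol{\omega}$. The following is a minimal sketch, not part of the original text. The function names are illustrative, and the sample matrix is assumed to be the cube-about-a-corner tensor of Example 10.2, written in units of $Ma^2/12$.

```python
# Inertia tensor of a cube about a corner (Example 10.2), in units of M a^2 / 12.
# Using this matrix as sample data is an assumption made for illustration.
CUBE_CORNER = [[8.0, -3.0, -3.0],
               [-3.0, 8.0, -3.0],
               [-3.0, -3.0, 8.0]]


def mat_vec(matrix, vector):
    """Return matrix * vector for a 3x3 matrix and a 3-vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def is_principal_direction(inertia, omega, tol=1e-12):
    """True if L = I omega is parallel to omega, that is, if L x omega = 0."""
    ang_mom = mat_vec(inertia, omega)
    scale = max(1.0, (dot(ang_mom, ang_mom) * dot(omega, omega)) ** 0.5)
    return all(abs(c) <= tol * scale for c in cross(ang_mom, omega))


def moment_about(inertia, omega):
    """The number lambda in L = lambda * omega (meaningful along a principal axis)."""
    return dot(omega, mat_vec(inertia, omega)) / dot(omega, omega)
```

For this sample matrix, the body diagonal (1, 1, 1) passes the test while the x axis (1, 0, 0) does not, in line with the discussion of principal axes in this section.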

Let us review what we already know about the existence of principal axes. We saw
at the end of the last section that if the inertia tensor $\mathbf{I}$, with respect to a chosen set of
axes, turns out to be diagonal,

$$\mathbf{I} = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}, \tag{10.66}$$

then the chosen x, y, and z axes are principal axes. Conversely, if the x, y, and z axes
are principal axes, then it is easy to see (Problem 10.29) that $\mathbf{I}$ must be diagonal, as
in (10.66). The three numbers that I have denoted by $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the moments
of inertia about the three principal axes and are called the principal moments.

If a body has a symmetry axis through O (like the spinning top of Example 10.3),
then that axis is a principal axis. Furthermore, any two axes perpendicular to the
symmetry axis (like the x and y axes in Example 10.3) are also principal axes, since
the inertia tensor with respect to these axes is diagonal. Again, if a body has two
perpendicular planes of reflection symmetry through O (like the cube spinning about
its center⁸ in Example 10.2), then the three perpendicular axes defined by these two
planes and O are principal axes.

8 Of course the cube has three planes of reflection symmetry, but two are enough to guarantee the
claim made here. For example, if the cone of Figure 10.6 were an elliptical cone, the z axis would
no longer be an axis of rotational symmetry, but the planes x = 0 and y = 0 would still provide two
planes of reflection symmetry, and the three coordinate axes would still be principal axes.

So far all our examples of principal axes have involved bodies with special symmetries,
and you may very reasonably be thinking that the existence of principal axes
is somehow tied to a body’s having some symmetry. In fact, however, this is most
emphatically not so. Although symmetries of a body make it much easier to spot the
principal axes, it turns out that any rigid body rotating about any point has three
principal axes:


Existence of Principal Axes

For any rigid body and any point O there are three perpendicular principal axes
through O. That is, there are three perpendicular axes through O with respect
to which the inertia tensor $\mathbf{I}$ is diagonal and, hence, with the property that when
the angular velocity $\boldsymbol{\omega}$ points along any one of these axes the same is true of the
angular momentum $\mathbf{L}$.


This surprising result is the consequence of an important mathematical theorem 
which states that if I is any real symmetric matrix (such as the inertia tensor of some 
body with respect to any chosen set of orthogonal axes) then there exists another set 
of orthogonal axes (with the same origin) such that the corresponding matrix (call 
it I'), evaluated with respect to the new axes, has the diagonal form (10.66). This 
result, which is proved in the appendix, is extremely useful, since the discussion of 
rotational motion is much simpler if it can be referred to a set of principal axes, and 
the result guarantees that this can always be done. (It may be worth mentioning right 
away, however, that the principal axes of a rigid body are naturally fixed in the body. 
Thus when we choose the principal axes as our coordinate axes, we are committing 
ourselves to using a rotating set of axes.) 

While it is not essential to see a proof of the existence of principal axes, we certainly 
do need to know how to find the principal axes, and this is what I take up in the next 
section. 


Kinetic Energy of a Rotating Body 

It is naturally important to be able to write down the kinetic energy of a rotating body.
I shall leave it as a challenging exercise (Problem 10.33) to show that

$$T = \tfrac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L}. \tag{10.67}$$

In particular, if we use a set of principal axes for our coordinate system, then
$\mathbf{L} = (\lambda_1\omega_1, \lambda_2\omega_2, \lambda_3\omega_3)$ and

$$T = \tfrac{1}{2}\left(\lambda_1\omega_1^2 + \lambda_2\omega_2^2 + \lambda_3\omega_3^2\right). \tag{10.68}$$

This important result is a natural generalization of Equation (10.26), $T = \tfrac{1}{2}I_{zz}\omega^2$, for
rotation about a fixed z axis. We shall use the result in writing the Lagrangian for a
spinning body in Section 10.9.


10.5 Finding the Principal Axes; Eigenvalue Equations

Suppose that we wish to find the principal axes of a rigid body rotating about a point
O. Suppose further that, using some given set of axes, we have calculated the inertia
tensor $\mathbf{I}$ for the body. If $\mathbf{I}$ is diagonal, then our axes already are the principal axes of the
body, and there is nothing more to do. Suppose, however, that $\mathbf{I}$ is not diagonal. How
are we to find the principal axes? The essential clue is Equation (10.65): If $\boldsymbol{\omega}$ points
along a principal axis, then $\mathbf{L}$ must equal $\lambda\boldsymbol{\omega}$ (for some number $\lambda$). Since $\mathbf{L} = \mathbf{I}\boldsymbol{\omega}$, it
follows that $\boldsymbol{\omega}$ must satisfy the equation

$$\mathbf{I}\boldsymbol{\omega} = \lambda\boldsymbol{\omega} \tag{10.69}$$

for some (as yet unknown) number $\lambda$. The equation (10.69) has the form

(matrix) × (vector) = (number) × (same vector)

and is called an eigenvalue equation. Eigenvalue equations are among the most
important equations of modern physics and arise in many different areas. They always
express the same idea, that some mathematical operation performed on a vector ($\boldsymbol{\omega}$ in
our case) produces a second vector ($\mathbf{I}\boldsymbol{\omega}$ here) that has the same direction as the first. A
vector $\boldsymbol{\omega}$ that satisfies (10.69) is called an eigenvector and the corresponding number
$\lambda$, the corresponding eigenvalue.

There are actually two parts to solving the eigenvalue problem of (10.69). Usually
we want to know the directions of $\boldsymbol{\omega}$ for which (10.69) is satisfied (namely, the
directions of the principal axes), and in most cases we also want to know the corresponding
eigenvalues $\lambda$ (namely, the moments of inertia about the principal axes). In practice,
we usually solve these two parts of the problem in the opposite order: first find the
possible eigenvalues $\lambda$ and then the corresponding directions of $\boldsymbol{\omega}$.

The first step in solving the matrix equation (10.69) is to rewrite it. Since $\boldsymbol{\omega} = \mathbf{1}\boldsymbol{\omega}$
(where $\mathbf{1}$ is the 3 × 3 unit matrix), Equation (10.69) is the same as $\mathbf{I}\boldsymbol{\omega} = \lambda\mathbf{1}\boldsymbol{\omega}$, or,
moving the right side over to the left,

$$(\mathbf{I} - \lambda\mathbf{1})\,\boldsymbol{\omega} = 0. \tag{10.70}$$

This is a matrix equation of the form $\mathbf{A}\boldsymbol{\omega} = 0$, where $\mathbf{A}$ is a 3 × 3 matrix and $\boldsymbol{\omega}$ is a
vector, that is, a 3 × 1 column of numbers $\omega_x$, $\omega_y$, and $\omega_z$. The matrix equation $\mathbf{A}\boldsymbol{\omega} = 0$
is really three simultaneous equations for the three numbers $\omega_x$, $\omega_y$, and $\omega_z$, and it is a
well-known property of such equations that they have a nonzero solution if and only
if the determinant, $\det(\mathbf{A})$, is zero.⁹ Therefore, the eigenvalue equation (10.70) has a
nonzero solution if and only if

$$\det(\mathbf{I} - \lambda\mathbf{1}) = 0. \tag{10.71}$$

9 See, for example, Mary Boas, Mathematical Methods in the Physical Sciences (Wiley, 1983),
page 133.

This equation is called the characteristic equation (or secular equation) for the
matrix $\mathbf{I}$. The determinant involved is a cubic in the number $\lambda$. Therefore, the equation
is a cubic equation for the eigenvalues $\lambda$ and will, in general, have three solutions, $\lambda_1$,
$\lambda_2$, and $\lambda_3$, the three principal moments. For each of these values of $\lambda$, the equation
(10.70) can be solved to give the corresponding vector $\boldsymbol{\omega}$, whose direction tells us the
direction of one of the three principal axes of the rigid body under consideration.
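The characteristic equation (10.71) can also be solved numerically. The sketch below is an illustration rather than the method of the text: it evaluates $\det(\mathbf{I} - \lambda\mathbf{1})$ directly, and it finds the three roots with a standard trigonometric closed form for real symmetric 3 × 3 matrices. The function names are hypothetical, and the sample matrix is assumed to be the cube-about-a-corner tensor of Example 10.2 in units of $Ma^2/12$.

```python
import math

# Cube about a corner (Example 10.2), in units of M a^2 / 12 (assumed sample data).
CUBE_CORNER = [[8.0, -3.0, -3.0],
               [-3.0, 8.0, -3.0],
               [-3.0, -3.0, 8.0]]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def characteristic_polynomial(inertia, lam):
    """Evaluate det(I - lam * 1), the left side of the characteristic equation."""
    shifted = [[inertia[i][j] - (lam if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)


def principal_moments(inertia):
    """Roots of the characteristic cubic of a real symmetric 3x3 matrix, ascending."""
    off = inertia[0][1] ** 2 + inertia[0][2] ** 2 + inertia[1][2] ** 2
    if off == 0.0:
        # Already diagonal: the diagonal elements are the principal moments.
        return sorted(inertia[i][i] for i in range(3))
    q = (inertia[0][0] + inertia[1][1] + inertia[2][2]) / 3.0
    p = math.sqrt((sum((inertia[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
    scaled = [[(inertia[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
              for i in range(3)]
    r = max(-1.0, min(1.0, det3(scaled) / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest  # the trace is the sum of the roots
    return [smallest, middle, largest]
```

Calling `principal_moments(CUBE_CORNER)` returns three roots, two of which coincide to within rounding, and `characteristic_polynomial` vanishes at each of them. In practice a library eigenvalue routine would normally be preferred; the closed form is used here only to keep the sketch self-contained.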

If you have never seen this procedure for finding the principal axes (or, more 
generally, for solving eigenvalue problems), you will certainly want to see an example 
and to work through some yourself. Here is one as a start. 


EXAMPLE 10.4 Principal Axes of a Cube about a Corner

Find the principal axes and corresponding moments for the cube of Example
10.2, rotating about its corner. What is the form of the inertia tensor evaluated
with respect to the principal axes?

Using axes parallel to the edges of the cube, we found the inertia tensor to
be [Equation (10.49)]

$$\mathbf{I} = \mu\begin{bmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{bmatrix} \tag{10.72}$$

where I have introduced the abbreviation $\mu$ for the constant $\mu = Ma^2/12$, which
has the dimensions of moment of inertia. Since $\mathbf{I}$ is not diagonal, it is clear that
our original chosen axes (parallel to the edges of the cube) are not the principal
axes. To find the principal axes, we must find the directions of $\boldsymbol{\omega}$ that satisfy the
eigenvalue equation $\mathbf{I}\boldsymbol{\omega} = \lambda\boldsymbol{\omega}$.

Our first step is to find the values of $\lambda$ (the eigenvalues) that satisfy the
characteristic equation $\det(\mathbf{I} - \lambda\mathbf{1}) = 0$. Substituting (10.72) for $\mathbf{I}$, we find that

$$\mathbf{I} - \lambda\mathbf{1} = \begin{bmatrix} 8\mu & -3\mu & -3\mu \\ -3\mu & 8\mu & -3\mu \\ -3\mu & -3\mu & 8\mu \end{bmatrix} - \begin{bmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{bmatrix} = \begin{bmatrix} 8\mu - \lambda & -3\mu & -3\mu \\ -3\mu & 8\mu - \lambda & -3\mu \\ -3\mu & -3\mu & 8\mu - \lambda \end{bmatrix}.$$

The determinant of this matrix is straightforward to evaluate and is

$$\det(\mathbf{I} - \lambda\mathbf{1}) = (2\mu - \lambda)(11\mu - \lambda)^2. \tag{10.73}$$

Thus the three roots of the equation $\det(\mathbf{I} - \lambda\mathbf{1}) = 0$ (the eigenvalues) are

$$\lambda_1 = 2\mu \quad\text{and}\quad \lambda_2 = \lambda_3 = 11\mu. \tag{10.74}$$

Notice that in this case two of the three roots of the cubic (10.73) happen to be
equal.

Armed with the eigenvalues, we can now find the eigenvectors, that is, the
directions of the three principal axes of our cube rotating about its corner. These
are determined by Equation (10.70), which we must examine for each of the
eigenvalues $\lambda_1$, $\lambda_2$, and $\lambda_3$ in turn (though in this case the last two are equal).
Let us start with $\lambda_1$.

With $\lambda = \lambda_1 = 2\mu$, Equation (10.70) becomes

$$(\mathbf{I} - \lambda\mathbf{1})\,\boldsymbol{\omega} = \mu\begin{bmatrix} 6 & -3 & -3 \\ -3 & 6 & -3 \\ -3 & -3 & 6 \end{bmatrix}\begin{bmatrix} \omega_x \\ \omega_y \\ \omega_z \end{bmatrix} = 0. \tag{10.75}$$

This gives three equations for the components of $\boldsymbol{\omega}$,

$$\begin{aligned} 2\omega_x - \omega_y - \omega_z &= 0 \\ -\omega_x + 2\omega_y - \omega_z &= 0 \\ -\omega_x - \omega_y + 2\omega_z &= 0. \end{aligned} \tag{10.76}$$

Subtracting the second equation from the first we see that $\omega_x = \omega_y$, and the first
then tells us that $\omega_x = \omega_z$. Therefore, $\omega_x = \omega_y = \omega_z$, and we conclude that the
first principal axis is in the direction (1, 1, 1) along the principal diagonal of the
cube. If we define a unit vector $\mathbf{e}_1$ in this direction,

$$\mathbf{e}_1 = \frac{1}{\sqrt{3}}(1, 1, 1), \tag{10.77}$$

then $\mathbf{e}_1$ specifies the direction of our first principal axis. If $\boldsymbol{\omega}$ points along $\mathbf{e}_1$, then
$\mathbf{L} = \mathbf{I}\boldsymbol{\omega} = \lambda_1\boldsymbol{\omega}$. This says simply that the moment of inertia about this principal
axis is $\lambda_1 = 2\mu = \tfrac{1}{6}Ma^2$. Thus our analysis of the first eigenvalue has produced
this conclusion: One of the principal axes of a cube, rotating about its corner O,
is the principal diagonal through O (direction $\mathbf{e}_1$), and the moment of inertia for
that axis is the corresponding eigenvalue $\tfrac{1}{6}Ma^2$.

The other two eigenvalues are equal ($\lambda_2 = \lambda_3 = 11\mu$), so there is just one
more case to consider. With $\lambda = 11\mu$, the eigenvalue equation (10.70) reads

$$(\mathbf{I} - \lambda\mathbf{1})\,\boldsymbol{\omega} = \mu\begin{bmatrix} -3 & -3 & -3 \\ -3 & -3 & -3 \\ -3 & -3 & -3 \end{bmatrix}\begin{bmatrix} \omega_x \\ \omega_y \\ \omega_z \end{bmatrix} = 0.$$

This gives three equations for the components of $\boldsymbol{\omega}$, but all three equations are
actually the same equation, namely

$$\omega_x + \omega_y + \omega_z = 0. \tag{10.78}$$

This equation does not uniquely determine the direction of $\boldsymbol{\omega}$. To see what it does
imply, notice that $\omega_x + \omega_y + \omega_z$ can be viewed as the scalar product of $\boldsymbol{\omega}$ with
the vector (1, 1, 1). Thus Equation (10.78) states simply that $\boldsymbol{\omega}\cdot\mathbf{e}_1 = 0$; that is,
$\boldsymbol{\omega}$ needs only to be orthogonal to our first principal axis $\mathbf{e}_1$. In other words, any
two orthogonal directions $\mathbf{e}_2$ and $\mathbf{e}_3$ that are perpendicular to $\mathbf{e}_1$ will serve as the
other two principal axes, both with moment of inertia $\lambda_2 = \lambda_3 = 11\mu = \tfrac{11}{12}Ma^2$.
This freedom in choosing the last two principal axes is directly related to
the circumstance that the last two eigenvalues $\lambda_2$ and $\lambda_3$ are equal; when all
three eigenvalues are different, each one leads to a unique direction for the
corresponding principal axis.

Finally, if we were to re-evaluate the inertia tensor with respect to new axes
in the directions $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$, then the new matrix $\mathbf{I}'$ would be diagonal, with
the principal moments down the diagonal,

$$\mathbf{I}' = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} = \tfrac{1}{12}Ma^2\begin{bmatrix} 2 & 0 & 0 \\ 0 & 11 & 0 \\ 0 & 0 & 11 \end{bmatrix}.$$

For this reason the process of finding the principal axes of a body is described
as diagonalization of the inertia tensor.
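The result of Example 10.4 can be checked by re-evaluating the tensor in a rotated set of axes, using $I'_{ab} = \mathbf{e}_a\cdot(\mathbf{I}\,\mathbf{e}_b)$ for orthonormal $\mathbf{e}_a$. The sketch below is illustrative only. The particular choice of $\mathbf{e}_2$ and $\mathbf{e}_3$ is an assumption (any orthonormal pair perpendicular to $\mathbf{e}_1$ would do), and the matrix is in units of $\mu = Ma^2/12$.

```python
import math

# Inertia tensor (10.72) of the cube about its corner, in units of mu = M a^2 / 12.
CUBE_CORNER = [[8.0, -3.0, -3.0],
               [-3.0, 8.0, -3.0],
               [-3.0, -3.0, 8.0]]


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def mat_vec(matrix, vector):
    """Return matrix * vector for a 3x3 matrix and a 3-vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def unit(vector):
    """Normalize a nonzero 3-vector to unit length."""
    norm = math.sqrt(dot(vector, vector))
    return [x / norm for x in vector]


def tensor_in_new_axes(inertia, axes):
    """Components I'_ab = e_a . (I e_b) with respect to orthonormal axes e_1, e_2, e_3."""
    return [[dot(e_a, mat_vec(inertia, e_b)) for e_b in axes] for e_a in axes]


E1 = unit([1.0, 1.0, 1.0])    # body diagonal, Equation (10.77)
E2 = unit([1.0, -1.0, 0.0])   # one hypothetical choice perpendicular to E1
E3 = unit([1.0, 1.0, -2.0])   # perpendicular to both E1 and E2

I_PRIME = tensor_in_new_axes(CUBE_CORNER, [E1, E2, E3])
```

Up to rounding, `I_PRIME` is diagonal with the principal moments $2\mu$, $11\mu$, $11\mu$ on the diagonal. Replacing `E2` and `E3` by any other orthonormal pair perpendicular to `E1` should give the same matrix, which is the freedom noted in the example.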


The last paragraph of this example illustrates a useful point: By the time we have
found the principal axes of a body with the corresponding principal moments, there
is no need to re-evaluate the inertia tensor with respect to the new axes. We know that
with respect to the principal axes it is bound to be diagonal,

$$\mathbf{I} = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} \tag{10.79}$$

with the principal moments $\lambda_1$, $\lambda_2$, and $\lambda_3$ down the diagonal. In general the three
principal moments will all be different, in which case the directions of the three
principal axes are uniquely determined and are automatically orthogonal (see Problem
10.38). As we saw in Example 10.4, it can happen that two of the principal moments
are equal, in which case the corresponding two principal axes can have any direction
that is orthogonal to the third axis. (This is also what happens with any body that has
rotational symmetry about an axis through O.) If all three principal moments are the
same (as with a cube or sphere about its center) then, in fact, any axis is a principal
axis. For proofs of these statements about the uniqueness or otherwise of the principal
axes, see Problem 10.38.


10.6 Precession of a Top due to a Weak Torque 


We now know enough about the angular momentum of a rigid body to start solving
some interesting problems. We’ll start with the phenomenon of precession of a
spinning top subject to a weak torque.

Consider a symmetric top as shown in Figure 10.7. The axes labeled x, y, and z
are fixed to the ground with the z axis vertically up. The top is pivoted freely at its tip
O, and it makes an angle $\theta$ with the vertical. Because of the top’s axial symmetry, its
inertia tensor is diagonal, with the form

$$\mathbf{I} = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} \tag{10.80}$$

relative to the top’s principal axes (namely, an axis along $\mathbf{e}_3$, the symmetry axis, and
any two orthogonal axes perpendicular to $\mathbf{e}_3$).

Figure 10.7 A spinning top is made of a rod OP fastened normally
through the center of a uniform circular disc and freely
pivoted at O. The total mass is M, and $\mathbf{R}$ denotes the CM position
relative to O. The principal axes of the top are in the directions
of the unit vector $\mathbf{e}_3$ along the symmetry axis of the top, and any
two orthogonal unit vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ perpendicular to $\mathbf{e}_3$. The
top’s weight $M\mathbf{g}$ produces a torque, which causes the angular
momentum $\mathbf{L}$ to change.


Let us suppose first that gravity has been switched off and that the top is spinning
about its symmetry axis, with angular velocity $\boldsymbol{\omega} = \omega\mathbf{e}_3$ and angular momentum

$$\mathbf{L} = \lambda_3\boldsymbol{\omega} = \lambda_3\omega\,\mathbf{e}_3. \tag{10.81}$$

Since there is no net torque on the top, $\mathbf{L}$ is constant. Therefore, the top will continue
indefinitely to spin about the same axis with the same angular velocity.¹⁰

10 That the angular velocity remains constant is actually not completely obvious. However, we
shall prove it later on, so please accept it as a reasonable claim for now.

Now let us switch gravity back on, causing a torque $\boldsymbol{\Gamma} = \mathbf{R}\times M\mathbf{g}$, with magnitude
$RMg\sin\theta$ and a direction which is perpendicular to both the vertical z axis and the
axis of the top. Let us suppose further that this torque is small. (We can ensure this
by arranging that any or all of the parameters R, M, or g are small compared with the
other relevant parameters of the system.) The existence of a torque implies that the
angular momentum starts to change, since $\boldsymbol{\Gamma} = \dot{\mathbf{L}}$.

The changing of $\mathbf{L}$ implies that $\boldsymbol{\omega}$ starts to change and that the components $\omega_1$
and $\omega_2$ cease to be zero. However, to the extent that the torque is small, we can
expect $\omega_1$ and $\omega_2$ to remain small.¹¹ This means that Equation (10.81) remains a good
approximation. (That is, the main contribution to $\mathbf{L}$ continues to be the spinning about
$\mathbf{e}_3$.) In this approximation, the torque $\boldsymbol{\Gamma}$ is perpendicular to $\mathbf{L}$ (since $\boldsymbol{\Gamma}$ is perpendicular
to $\mathbf{e}_3$), which means that $\mathbf{L}$ changes in direction but not in magnitude. From (10.81)
we see that $\mathbf{e}_3$ starts changing in direction, while $\omega$ remains constant. Specifically, the
equation $\dot{\mathbf{L}} = \boldsymbol{\Gamma}$ becomes

$$\lambda_3\omega\,\dot{\mathbf{e}}_3 = \mathbf{R}\times M\mathbf{g}$$

or, substituting $\mathbf{R} = R\,\mathbf{e}_3$ and $\mathbf{g} = -g\hat{\mathbf{z}}$ (where $\hat{\mathbf{z}}$ is the unit vector that points vertically
upward),

$$\dot{\mathbf{e}}_3 = \frac{MgR}{\lambda_3\omega}\,\hat{\mathbf{z}}\times\mathbf{e}_3 = \boldsymbol{\Omega}\times\mathbf{e}_3 \tag{10.82}$$

where

$$\boldsymbol{\Omega} = \frac{MgR}{\lambda_3\omega}\,\hat{\mathbf{z}}. \tag{10.83}$$

You will recognize (10.82) as saying that the axis of the top, $\mathbf{e}_3$, rotates with angular
velocity $\boldsymbol{\Omega}$ about the vertical direction $\hat{\mathbf{z}}$.

11 Again, this plausible statement should be (and will be) proved, but let us accept it for now.

Our conclusion is that the torque exerted by gravity causes the top’s axis to precess,
that is, to move slowly around a vertical cone, with fixed angle $\theta$ and with angular
frequency $\Omega = RMg/\lambda_3\omega$.¹² This precession, although surprising at first glance, can
be understood in elementary terms: In the view of Figure 10.7, the gravitational torque
is clockwise, and the torque vector $\boldsymbol{\Gamma}$ is into the page. Since $\dot{\mathbf{L}} = \boldsymbol{\Gamma}$, this requires that the
change in $\mathbf{L}$ be into the page, which is exactly the direction of the predicted precession.

This precession of the axis of a spinning body is an effect that you have almost
certainly observed when playing with a child’s top. The same effect shows up in several
other situations. For example, the earth spins on its axis, much like a spinning top,
and the axis of spin is inclined at an angle $\theta = 23°$ from the normal to the earth’s orbit
around the sun. Because of the earth’s equatorial bulge, the sun and moon exert small
torques on the earth and these torques cause the earth’s axis to precess slowly (one
complete turn in 26,000 years), tracing out a cone of half-angle 23° around the normal
to the orbital plane, a phenomenon known as the precession of the equinoxes. This
means that in another 13,000 years the pole star will be some 46° away from true north.
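To get a feel for the size of the precession rate $\Omega = MgR/\lambda_3\omega$ of a top, here is a small numerical sketch. It is not from the original text: the function names and the sample values for a toy top are assumptions chosen only for illustration. The second function evaluates $\dot{\mathbf{e}}_3 = \Omega\,\hat{\mathbf{z}}\times\mathbf{e}_3$ at one instant, which makes it easy to see that the tilt angle $\theta$ does not change.

```python
import math


def precession_rate(mass, g, r_cm, lambda3, spin):
    """Precession rate Omega = M g R / (lambda3 * omega), valid when Omega << omega."""
    return mass * g * r_cm / (lambda3 * spin)


def axis_rate(theta, big_omega):
    """Return e3 and its rate of change Omega * (zhat x e3).

    The axis e3 is taken to lie in the xz plane, tilted by theta from the vertical.
    """
    e3 = (math.sin(theta), 0.0, math.cos(theta))
    # For any vector (x, y, z), zhat x (x, y, z) = (-y, x, 0).
    e3_dot = (-big_omega * e3[1], big_omega * e3[0], 0.0)
    return e3, e3_dot


# Hypothetical toy top: 0.1 kg, CM 3 cm from the pivot, lambda3 = 2e-5 kg m^2,
# spinning at 200 rad/s. These numbers are illustrative, not from the text.
OMEGA_P = precession_rate(mass=0.1, g=9.8, r_cm=0.03, lambda3=2.0e-5, spin=200.0)
PERIOD = 2.0 * math.pi / OMEGA_P  # time for one turn of the axis around the vertical
```

With these assumed values the precession rate is a few percent of the spin rate, so the weak-torque approximation of this section is reasonable. The z component of `e3_dot` is zero, and `e3_dot` is perpendicular to `e3`, so the axis keeps a fixed angle to the vertical and a fixed length, as the text describes.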


12 I should perhaps emphasize again that the discussion here is an approximation, the criterion
for its validity being that $\Omega \ll \omega$. An exact analysis shows that, if launched as described here, the
top will also make very small oscillations called nutations in the $\theta$ direction, although in practice
these are quickly damped out by friction.


10.7 Euler’s Equations


We are now ready to set up the equations of motion (or at least one form of the
equations of motion) for a rotating rigid body. The two situations to which our
discussion will principally apply are these: (1) A body that is pivoted about one fixed
point, like the spinning top of Section 10.6, and (2) a body without any fixed point, like
a drum major’s flying baton, whose rotational motion about the CM we have chosen
to examine. The equations that I shall derive are called Euler’s equations (named
for the same mathematician as the Euler-Lagrange equations of Chapter 6) and can
be regarded as the rotational version of Newton’s second law in the form $m\mathbf{a} = \mathbf{F}$.
As we shall see, there are some problems that can be easily solved using Euler’s
equations. However, many problems are more easily solved using the Lagrangian
approach, which we’ll take up in Section 10.10.

Figure 10.8 Axes used to derive Euler’s equations for a rotating
rigid body (shown here as an egg). The axes labeled x, y, and z
define an inertial frame, often called the space frame. The unit
vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ point along the principal axes of the body
and define the rotating, noninertial body frame. If the body has
no fixed point (like an egg thrown in the air) then O is normally
chosen to be the CM of the body. If the body has a fixed pivot,
then O is that fixed point, and we would generally choose O to
be the origin of both reference frames.

Before we launch into the derivation of Euler’s equations, there is a complication 
we must now face up to. To take advantage of our understanding of the inertia tensor, 
and particularly the principal axes, we naturally want to use the principal axes of the 
body as our coordinate axes. However, because the principal axes are fixed in the 
rotating body, this inevitably involves us in using axes that rotate. We shall, therefore, 
need to use the machinery of Chapter 9 for handling rotating reference frames. The 
notation that I shall use is illustrated in Figure 10.8. For an inertial frame, relative to 
which Newton’s laws hold in their simple form, we’ll use the axes labeled x, y, and 
z. This frame is traditionally called the space frame, presumably because it is fixed 
in space (that is, inertial). The rotating frame is defined by the three unit vectors $\mathbf{e}_1$,
$\mathbf{e}_2$, and $\mathbf{e}_3$, fixed in the body and pointing along the principal axes of the body. This
frame, fixed in the body, is called the body frame. 

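If the angular velocity of the body is $\boldsymbol{\omega}$, and the principal moments of the body are
$\lambda_1$, $\lambda_2$, and $\lambda_3$, then the angular momentum, as measured in the body frame, is

$$\mathbf{L} = (\lambda_1\omega_1,\ \lambda_2\omega_2,\ \lambda_3\omega_3) \qquad \text{[in the body frame]}. \tag{10.84}$$

Now, if $\boldsymbol{\Gamma}$ is the torque acting on the body, we know that as seen in the space frame

$$\left(\frac{d\mathbf{L}}{dt}\right)_{\text{space}} = \boldsymbol{\Gamma}. \tag{10.85}$$

We saw in Chapter 9 that the rates of change of any vector as seen in the two frames
are related by (9.30)

$$\left(\frac{d\mathbf{L}}{dt}\right)_{\text{space}} = \left(\frac{d\mathbf{L}}{dt}\right)_{\text{body}} + \boldsymbol{\omega}\times\mathbf{L} = \dot{\mathbf{L}} + \boldsymbol{\omega}\times\mathbf{L} \tag{10.86}$$

where in the last expression I have reintroduced the convention that a dot represents a
time derivative evaluated in the rotating body frame (whose angular velocity is $\boldsymbol{\omega}$, the
angular velocity of the body itself). Substituting (10.86) into (10.85), we arrive at the
equation of motion for the rotating body frame:

$$\dot{\mathbf{L}} + \boldsymbol{\omega}\times\mathbf{L} = \boldsymbol{\Gamma}. \tag{10.87}$$

This equation is called Euler’s equation. Using (10.84), we can resolve Euler’s
equation into its three components:

$$\begin{aligned} \lambda_1\dot{\omega}_1 - (\lambda_2 - \lambda_3)\,\omega_2\omega_3 &= \Gamma_1 \\ \lambda_2\dot{\omega}_2 - (\lambda_3 - \lambda_1)\,\omega_3\omega_1 &= \Gamma_2 \\ \lambda_3\dot{\omega}_3 - (\lambda_1 - \lambda_2)\,\omega_1\omega_2 &= \Gamma_3 \end{aligned} \qquad \text{[Euler’s equations]} \tag{10.88}$$

which are often referred to as Euler’s equations.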
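For numerical work it is convenient to solve Euler’s equations for the rates $\dot\omega_1$, $\dot\omega_2$, $\dot\omega_3$. The sketch below does just that, with all components taken in the body frame along the principal axes. The function name and argument layout are illustrative assumptions, not part of the text.

```python
def euler_rates(moments, omega, torque):
    """Solve Euler's equations for (omega1_dot, omega2_dot, omega3_dot).

    moments: principal moments (lambda1, lambda2, lambda3)
    omega:   angular velocity components along the principal axes (body frame)
    torque:  torque components along the principal axes (body frame)
    """
    l1, l2, l3 = moments
    w1, w2, w3 = omega
    g1, g2, g3 = torque
    return ((g1 + (l2 - l3) * w2 * w3) / l1,
            (g2 + (l3 - l1) * w3 * w1) / l2,
            (g3 + (l1 - l2) * w1 * w2) / l3)
```

Two quick checks follow directly from the equations. With zero torque and $\boldsymbol{\omega}$ along a single principal axis, all three rates vanish. For a body with $\lambda_1 = \lambda_2$ and a torque with no third component, the third rate vanishes whatever the other components are.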

The three Euler equations determine the motion of a spinning body as seen in a
frame fixed in the body. In general, they are difficult to use because the components $\Gamma_1$,
$\Gamma_2$, and $\Gamma_3$ of the applied torque as seen in the rotating body frame are complicated (and
unknown) functions of time. In fact, the main use of Euler’s equations is in the case
that the applied torque is zero, as I shall discuss in the next section. However, there are
a few other cases where the torque is simple enough that we can get useful information
from Euler’s equations. For example, consider again the spinning top of Section 10.6.
As we saw there, the gravitational torque on the top is always perpendicular to the axis
$\mathbf{e}_3$, so $\Gamma_3$ is always zero. Furthermore, because of the top’s axial symmetry, the two
moments of inertia $\lambda_1$ and $\lambda_2$ are equal. Thus the third of the Euler equations (10.88)
reduces to

$$\lambda_3\dot{\omega}_3 = 0.$$

That is, the component of $\boldsymbol{\omega}$ along the symmetry axis is constant, a result that I stated
as reasonable, but did not prove, in the discussion of Section 10.6.¹³


13 The other main unproved assertion of that discussion, that the components $\omega_1$ and $\omega_2$ remain
small for all times, can also be understood using Euler’s equations. The components $\Gamma_1$ and $\Gamma_2$
of the torque are nonzero, which is what drives $\omega_1$ and $\omega_2$. Nonetheless, they are both small and
oscillate rapidly as the top spins. It is reasonably easy to see that, under these conditions, the first
two Euler equations (10.88) imply that $\omega_1$ and $\omega_2$ also oscillate rapidly and with small amplitude.
This is the required result.


10.8 Euler’s Equations with Zero Torque


Let us now consider a rotating body subject to zero torque. In this case, Euler’s
equations (10.88) take the simple form

$$\begin{aligned} \lambda_1\dot{\omega}_1 &= (\lambda_2 - \lambda_3)\,\omega_2\omega_3 \\ \lambda_2\dot{\omega}_2 &= (\lambda_3 - \lambda_1)\,\omega_3\omega_1 \\ \lambda_3\dot{\omega}_3 &= (\lambda_1 - \lambda_2)\,\omega_1\omega_2. \end{aligned} \tag{10.89}$$

I shall discuss these equations, first for the case that the three principal moments $\lambda_1$,
$\lambda_2$, and $\lambda_3$ are all different, and then for the case that $\lambda_1 = \lambda_2 \neq \lambda_3$ (as with a spinning
top).


A Body with Three Different Principal Moments

Let us suppose first that the principal moments of the body under consideration are
all different. If at time t = 0 the body happens to be rotating about one of its principal
axes ($\mathbf{e}_3$ say), then $\omega_1 = \omega_2 = 0$ at t = 0. Now, with $\omega_1 = \omega_2 = 0$, the right sides of all
three Euler equations (10.89) are zero. This shows that as long as $\omega_1$ and $\omega_2$ are zero,
all three components of $\boldsymbol{\omega}$ remain constant. That is, $\omega_1$ and $\omega_2$ remain zero and $\omega_3$ is
constant. In other words, if the body starts out rotating about one of its principal axes,
it will continue to do so, with constant angular velocity $\boldsymbol{\omega}$. This statement applies,
in the first instance, to the angular velocity as measured in the rotating body frame.
However, with $\boldsymbol{\omega}$ along $\mathbf{e}_3$, the angular momentum is $\mathbf{L} = \lambda_3\boldsymbol{\omega}$, and we know that $\mathbf{L}$ is
constant as seen in any inertial frame. Thus our result applies equally in any inertial
frame: If a body that is subject to no torque is spinning initially about any principal
axis, it will continue to do so indefinitely with constant angular velocity.

The converse of this result is also true. If at t = 0, the angular velocity is not along
a principal axis, then $\boldsymbol{\omega}$ is not constant. To see this, notice first that if $\boldsymbol{\omega}$ is not along
a principal axis, then at least two of its components are nonzero. If you look at the
Euler equations (10.89) you will see that, with two components of $\boldsymbol{\omega}$ nonzero, at least
one component of $\dot{\boldsymbol{\omega}}$ must be nonzero. (For example, if $\omega_1$ and $\omega_2$ are nonzero, then
$\dot{\omega}_3 \neq 0$.) Therefore, with two of its components nonzero, $\boldsymbol{\omega}$ cannot be constant.

We conclude that the only way a body with three different principal moments can
rotate freely with constant angular velocity is by rotating about one of its principal
axes. It is interesting to know if this kind of rotation is stable. That is, if a body is
rotating about one of its principal axes and is given a very small kick, will it continue
to rotate close to its original axis, or will its motion change completely? Let us suppose
that the body is rotating about the axis $\mathbf{e}_3$, with $\omega_1 = \omega_2 = 0$. If now we give it a tiny
kick, then $\omega_1$ and $\omega_2$ will pick up nonzero values that are, at least initially, small. The
question is whether the values of $\omega_1$ and $\omega_2$ will remain small or start to grow bigger.
To answer this, we note from the third of the Euler equations (10.89) that as long as
both $\omega_1$ and $\omega_2$ are small, $\dot{\omega}_3$ remains very small (“small × small”). Thus initially, at
least, it is a good approximation to take $\omega_3$ to be constant. In this case, the first two
Euler equations read

$$\begin{aligned} \lambda_1\dot{\omega}_1 &= \left[(\lambda_2 - \lambda_3)\,\omega_3\right]\omega_2 \\ \lambda_2\dot{\omega}_2 &= \left[(\lambda_3 - \lambda_1)\,\omega_3\right]\omega_1 \end{aligned} \tag{10.90}$$


where the coefficients in square brackets are (approximately) constant. These coupled
first-order equations for $\omega_1$ and $\omega_2$ are easily solved. Perhaps the simplest method is
to differentiate the first equation once and then substitute the second, to give

$$\ddot{\omega}_1 = -\left[\frac{(\lambda_3 - \lambda_2)(\lambda_3 - \lambda_1)}{\lambda_1\lambda_2}\,\omega_3^2\right]\omega_1. \tag{10.91}$$

If the coefficient in square brackets is positive, the solution for $\omega_1$ is a sine or cosine in
time, and $\omega_1$ undergoes small oscillations, returning repeatedly to zero. According to
the first equation of (10.90), $\omega_2$ is proportional to $\dot{\omega}_1$, so, under the same conditions,
$\omega_2$ undergoes similar small oscillations. To see what these conditions imply, notice
that the coefficient in (10.91) is positive if $\lambda_3$ is larger than both $\lambda_1$ and $\lambda_2$ or smaller
than both $\lambda_1$ and $\lambda_2$. Therefore, we have shown that if the body is spinning about either
the principal axis with largest moment or that with smallest moment, the motion is
stable against small disturbances.

On the other hand, if $\lambda_3$ lies between $\lambda_1$ and $\lambda_2$, the coefficient in brackets in
(10.91) is negative, and the solution for $\omega_1$ is a real exponential, which moves rapidly
away from zero.¹⁴ Since $\omega_2 \propto \dot{\omega}_1$, the same is true of $\omega_2$, and we reach the following
conclusion: For a freely spinning body, rotation about the principal axis with the
intermediate moment ($\lambda_3$ less than $\lambda_1$ but more than $\lambda_2$ or vice versa) is unstable.
You can test this interesting claim with a book, held shut with a rubber band. If you
throw it up giving it a spin about either the maximum or minimum axis, it will continue
to spin stably about that axis. If you give it a spin about the intermediate axis, it will
tumble wildly (and be much harder to catch).
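The stability claim can also be explored numerically by integrating the torque-free Euler equations (10.89). The sketch below uses a fixed-step fourth-order Runge-Kutta integrator. The principal moments (1, 2, 3), the size of the initial kick, and the step size are all hypothetical choices made for illustration, and the function names are not from the text.

```python
def free_euler_rates(moments, omega):
    """Right sides of the torque-free Euler equations, solved for the rates."""
    l1, l2, l3 = moments
    w1, w2, w3 = omega
    return ((l2 - l3) * w2 * w3 / l1,
            (l3 - l1) * w3 * w1 / l2,
            (l1 - l2) * w1 * w2 / l3)


def rk4_step(moments, omega, dt):
    """Advance omega by one fourth-order Runge-Kutta step of length dt."""
    def shifted(base, rate, h):
        return tuple(b + h * r for b, r in zip(base, rate))

    k1 = free_euler_rates(moments, omega)
    k2 = free_euler_rates(moments, shifted(omega, k1, dt / 2.0))
    k3 = free_euler_rates(moments, shifted(omega, k2, dt / 2.0))
    k4 = free_euler_rates(moments, shifted(omega, k3, dt))
    return tuple(w + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for w, a, b, c, d in zip(omega, k1, k2, k3, k4))


def simulate(moments, omega0, t_end, dt=0.01):
    """Return the list of omega values from t = 0 to t_end (body-frame components)."""
    history = [tuple(omega0)]
    for _ in range(int(round(t_end / dt))):
        history.append(rk4_step(moments, history[-1], dt))
    return history


def largest_excursion(history, index):
    """Largest absolute value reached by one component of omega."""
    return max(abs(w[index]) for w in history)


MOMENTS = (1.0, 2.0, 3.0)  # hypothetical body with three different moments
ABOUT_LARGEST = simulate(MOMENTS, (0.01, 0.0, 1.0), 40.0)       # spin about e3 plus a kick
ABOUT_INTERMEDIATE = simulate(MOMENTS, (0.01, 1.0, 0.0), 40.0)  # spin about e2 plus a kick
```

In the first run the small component $\omega_1$ stays at roughly the size of the initial kick, while in the second it grows until it is comparable with the original spin, which is the tumbling described in the text. The kinetic energy (10.68) should stay constant along both runs, which gives a simple check on the integrator.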


14 There is a small subtlety that may be worth mentioning: The solution for $\omega_1$ is a combination
of two exponentials, $e^{\pm\alpha t}$, and you might think that the special case of a pure decaying exponential,
$e^{-\alpha t}$, was an exception to my claim. However, this solution is excluded by the initial conditions: If
the body is given a small kick from $\omega_1 = 0$, then $\omega_1$ and $\dot{\omega}_1$ must have the same sign, whereas for
the pure decaying exponential they have opposite signs.


Motion of a Body with Two Equal Moments: Free Precession

The complete solution of the Euler equations (10.89) for a freely rotating body with
three different principal moments is possible, but complicated and unilluminating. If
two of the three principal moments are equal (as with a spinning top), the corresponding
problem is much easier and quite interesting. Let us, therefore, consider this case,
and suppose that the first two moments are equal, $\lambda_1 = \lambda_2$. The crucial simplifying
feature of this case is that the third of the Euler equations (10.89) becomes

$$\dot{\omega}_3 = 0.$$

That is, $\omega_3$, the third component of the angular velocity (as measured in the body
frame), is constant,

$$\omega_3 = \text{const}.$$

Knowing that $\omega_3$ is constant, we can now rewrite the first two Euler equations as

$$\begin{aligned} \dot{\omega}_1 &= \frac{(\lambda_1 - \lambda_3)\,\omega_3}{\lambda_1}\,\omega_2 = \Omega_b\,\omega_2 \\ \dot{\omega}_2 &= -\frac{(\lambda_1 - \lambda_3)\,\omega_3}{\lambda_1}\,\omega_1 = -\Omega_b\,\omega_1 \end{aligned} \tag{10.92}$$

where I have defined the constant frequency

$$\Omega_b = \frac{\lambda_1 - \lambda_3}{\lambda_1}\,\omega_3. \tag{10.93}$$

(The subscript “b” stands for body, for reasons we’ll see in a moment.) The coupled
equations (10.92) for $\omega_1$ and $\omega_2$ are easily solved by the now familiar trick of setting
$\omega_1 + i\omega_2 = \eta$, which reduces (10.92) to

$$\dot{\eta} = -i\,\Omega_b\,\eta,$$

whence

$$\eta = \eta_0\,e^{-i\Omega_b t}.$$

If we choose our axes so that $\omega_1 = \omega_0$ and $\omega_2 = 0$ at t = 0, then $\eta_0 = \omega_0$ and, taking
real and imaginary parts of $\eta$, we find for the complete solution

$$\boldsymbol{\omega} = (\omega_0\cos\Omega_b t,\ -\omega_0\sin\Omega_b t,\ \omega_3) \tag{10.94}$$

with $\omega_0$ and $\omega_3$ both constant. The two components $\omega_1$ and $\omega_2$ rotate with angular
velocity $\Omega_b$, while $\omega_3$ remains constant. Since $\omega_0$ and $\omega_3$ are constant, so is the angle
$\alpha$ between $\boldsymbol{\omega}$ and $\mathbf{e}_3$. Therefore, as seen in the body frame, $\boldsymbol{\omega}$ moves steadily around a
cone, called the body cone, with angular frequency $\Omega_b$ given by (10.93), as indicated
in Figure 10.9(a).
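As a check on the algebra, the solution (10.94) can be substituted back into the pair of equations (10.92) numerically. The sketch below estimates the time derivatives by central differences and returns the two residuals, which should be zero to within the accuracy of the finite-difference estimate. The sample values ($\lambda_1 = 2$, $\lambda_3 = 1$, $\omega_3 = 3$, $\omega_0 = 0.5$) and the function names are hypothetical.

```python
import math


def body_precession_rate(lambda1, lambda3, omega3):
    """Omega_b = (lambda1 - lambda3) * omega3 / lambda1, Equation (10.93)."""
    return (lambda1 - lambda3) * omega3 / lambda1


def omega_body(t, omega0, omega3, rate):
    """Body-frame angular velocity of Equation (10.94) at time t."""
    return (omega0 * math.cos(rate * t), -omega0 * math.sin(rate * t), omega3)


def residuals(t, omega0, omega3, rate, h=1e-6):
    """Return (omega1_dot - rate*omega2, omega2_dot + rate*omega1) at time t.

    The derivatives are central-difference estimates with step h.
    """
    before = omega_body(t - h, omega0, omega3, rate)
    after = omega_body(t + h, omega0, omega3, rate)
    now = omega_body(t, omega0, omega3, rate)
    d1 = (after[0] - before[0]) / (2.0 * h)
    d2 = (after[1] - before[1]) / (2.0 * h)
    return d1 - rate * now[1], d2 + rate * now[0]


# Hypothetical prolate body: lambda1 = lambda2 = 2, lambda3 = 1, omega3 = 3.
RATE = body_precession_rate(lambda1=2.0, lambda3=1.0, omega3=3.0)
```

The same functions show why the angle between $\boldsymbol{\omega}$ and $\mathbf{e}_3$ is fixed: the transverse magnitude $\sqrt{\omega_1^2 + \omega_2^2}$ returned by `omega_body` equals $\omega_0$ at every time, while the third component never changes.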

The angular momentum $\mathbf{L}$ is given by

$$\mathbf{L} = (\lambda_1\omega_1,\; \lambda_1\omega_2,\; \lambda_3\omega_3)
= (\lambda_1\omega_0 \cos\Omega_b t,\; -\lambda_1\omega_0 \sin\Omega_b t,\; \lambda_3\omega_3). \tag{10.95}$$

Comparison of (10.94) and (10.95) should convince you that the three vectors, $\boldsymbol{\omega}$, $\mathbf{L}$,
and $\mathbf{e}_3$ lie in a single plane, with the angles between any two being constant in time.
(We’ve shown this in the body frame, but this means that it’s true in any frame.) Thus,
as seen in the body frame, both $\boldsymbol{\omega}$ and $\mathbf{L}$ precess around $\mathbf{e}_3$ at the same rate $\Omega_b$.

Figure 10.9 An axially symmetric body (shown here as a prolate
spheroid or “egg-shaped” solid) is rotating with angular velocity $\boldsymbol{\omega}$,
not in the direction of any of the principal axes. (a) As seen in the body
frame, both $\boldsymbol{\omega}$ and $\mathbf{L}$ precess about the symmetry axis, $\mathbf{e}_3$, with angular
frequency $\Omega_b$ given by (10.93). (b) As seen in the space frame, $\mathbf{L}$ is
fixed, and both $\boldsymbol{\omega}$ and $\mathbf{e}_3$ precess about $\mathbf{L}$ with frequency $\Omega_s$ given by
(10.96).


To find what happens as seen in the space frame, note that in any inertial frame the
vector $\mathbf{L}$ is constant. Therefore, as seen in the space frame the plane containing $\boldsymbol{\omega}$, $\mathbf{L}$,
and $\mathbf{e}_3$ must rotate about $\mathbf{L}$, and the two vectors $\boldsymbol{\omega}$ and $\mathbf{e}_3$ precess about $\mathbf{L}$, as shown
in Figure 10.9(b). In particular, in the space frame, $\boldsymbol{\omega}$ traces out a cone, called the
space cone, around which the body cone rolls. I leave it as a fairly difficult exercise
(Problem 10.46) to show that the rate of precession of $\boldsymbol{\omega}$ around the space cone can
be expressed in several ways, of which the simplest is this:

$$\Omega_s = \frac{L}{\lambda_1}. \tag{10.96}$$

Note well that the free precession derived here has nothing to do with any external 
torques on the spinning body. On the contrary, we have derived it for a body subject 
to no external torque. An interesting example of this precession is provided by the 
earth. I have already mentioned that the sun and moon produce a small torque on the 
earth’s equatorial bulge, and that this torque causes the precession of the equinoxes, 
the precession of the polar axis with a period of 26,000 years. But the equatorial 
bulge also means that the earth’s moment of inertia about the polar axis is larger than 
the other two principal moments by about 1 part in 300. According to (10.93) this
should imply a precession (unless the earth’s rotation is perfectly aligned with the
principal axis) with frequency $\Omega_b = \omega_3/300$. Since $\omega_3$ represents one revolution per
day, $\Omega_b$ should correspond to one revolution every 300 days. A tiny wobbling of the
polar axis (by less than a second of arc) was discovered by the American amateur 
astronomer Seth Chandler (1846-1913). This Chandler wobble is apparently due to 
the free-body precession discussed here, although its period is more like 400 days, 
supposedly because the earth is not a perfectly rigid body. 

10.9 Euler Angles*


* Sections marked with an asterisk contain material that is usually somewhat more advanced 
and could be omitted if you are pressed for time. 

The trouble with the Euler equations (10.88) is that they refer to axes fixed in the
body, and, except in a few very simple problems, such axes are hopelessly awkward
to work with. We need to come up with equations of motion relative to a nonrotating
space frame, and before we can do that we need a set of coordinates that specify the
orientation of our body relative to such a frame. There are several ways to do this, all
of them surprisingly cumbersome, but by far the most popular and useful is due to
Euler (yet again!) and specifies the orientation of a rigid body by three Euler angles.
In many applications, the body of interest rotates about a fixed point, and in this case
(the only case we’ll consider in detail) we naturally choose the fixed point to be the
origin $O$ of both the space and the body axes. As before, the basis vectors of the space
frame will be called $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$. For the body frame, we’ll use the principal axes of the
body, with directions $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. If two of the principal moments are equal, we’ll
take them to be numbers 1 and 2, so that $\lambda_1 = \lambda_2$, and we’ll refer to the third direction
$\mathbf{e}_3$ as the axis of symmetry. Let us imagine the body oriented initially with its three
axes lying along the corresponding space axes ($\mathbf{e}_1$ along $\hat{\mathbf{x}}$ and so on). We’re going
to see that by a sequence of three rotations, through angles $\theta$, $\phi$, and $\psi$ about three
different axes, we can bring the body into any assigned orientation and that the angles
$(\theta, \phi, \psi)$ specify a unique orientation of the body. In particular, the angles $\theta$ and $\phi$
will be just the polar angles of the axis $\mathbf{e}_3$ relative to the space frame.

Step (a). Starting with the body axes aligned with the space axes, we first rotate
the body through an angle $\phi$ about the axis $\hat{\mathbf{z}}$, as illustrated in the first frame of Figure
10.10. This rotates the first and second body axes in the $xy$ plane. In particular, the
second body axis now points in the direction labeled $\mathbf{e}_2'$.

Step (b). Next we rotate the body through an angle $\theta$ about the new axis $\mathbf{e}_2'$. This
moves the body axis $\mathbf{e}_3$ to the direction whose polar angles are $\theta$ and $\phi$. Evidently our
first two steps can bring the body axis $\mathbf{e}_3$ to any assigned orientation, and, with $\mathbf{e}_3$ in
position, the only remaining freedom is a rotation about $\mathbf{e}_3$.

Step (c). Finally, we’ll rotate the body about $\mathbf{e}_3$ through whatever angle $\psi$ is needed
to bring the body axes $\mathbf{e}_1$ and $\mathbf{e}_2$ into their assigned directions, as shown in the third
frame of Figure 10.10.

Figure 10.10 Definition of the Euler angles $\theta$, $\phi$, and $\psi$. Starting with the body axes $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$
and space axes $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, $\hat{\mathbf{z}}$ aligned, the three successive rotations bring the body axes to any
prescribed orientation.

The three angles $(\theta, \phi, \psi)$ are the Euler angles$^{15}$ that specify the body’s orientation.
Before we can use them, we must calculate a few parameters in terms of them,
starting with the angular velocity $\boldsymbol{\omega}$. To find $\boldsymbol{\omega}$, notice that we can regard the steps of
Figure 10.10 as defining a sequence of four reference frames, starting with the space
axes defined by $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$, and moving via two intermediate frames to wind up with
the body axes defined by $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. To find the angular velocity of the body axes
relative to the space axes, we have only to find the angular velocity of each of these frames
relative to its predecessor and form their vector sum. [Remember that relative angular
velocities add, as we observed in (9.25).] As $\phi$ varies, the frame defined by step (a)
rotates with angular velocity $\boldsymbol{\omega}_a = \dot\phi\,\hat{\mathbf{z}}$ relative to the space axes. Similarly the angular
velocity of the frame defined by step (b) relative to its predecessor is $\boldsymbol{\omega}_b = \dot\theta\,\mathbf{e}_2'$, and
that of the body frame in step (c) is $\boldsymbol{\omega}_c = \dot\psi\,\mathbf{e}_3$. Therefore, the required angular velocity
of the body frame relative to the space frame is

$$\boldsymbol{\omega} = \boldsymbol{\omega}_a + \boldsymbol{\omega}_b + \boldsymbol{\omega}_c = \dot\phi\,\hat{\mathbf{z}} + \dot\theta\,\mathbf{e}_2' + \dot\psi\,\mathbf{e}_3. \tag{10.97}$$

This expresses $\boldsymbol{\omega}$ in terms of a rather messy mixture of unit vectors, but it is a simple
matter to rewrite these vectors in terms of the unit vectors of any one frame (Problem
10.48).
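
To make the three steps concrete, here is a small Python sketch that builds the body axes from the Euler angles and resolves (10.97) along the space axes. It is an illustration under stated assumptions, not a construction taken from the text: the helper names (`rot_z`, `rot_y`, `matmul`, `body_axes`, `omega_space`) are hypothetical, and it relies on the standard fact that rotating about $\hat{\mathbf{z}}$, then about the new second axis, then about the new third axis is equivalent to the fixed-axis product $R_z(\phi)\,R_y(\theta)\,R_z(\psi)$.

```python
import math

def rot_z(a):
    """Matrix for a rotation through angle a about the z axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(a):
    """Matrix for a rotation through angle a about the y axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def body_axes(theta, phi, psi):
    """Columns are e1, e2, e3 in space-frame components after steps (a)-(c)."""
    return matmul(rot_z(phi), matmul(rot_y(theta), rot_z(psi)))

def omega_space(theta, phi, theta_dot, phi_dot, psi_dot):
    """Angular velocity (10.97) resolved along the space axes x, y, z."""
    z_hat = [0.0, 0.0, 1.0]
    e2_prime = [-math.sin(phi), math.cos(phi), 0.0]   # second body axis after step (a)
    e3 = [math.sin(theta) * math.cos(phi),
          math.sin(theta) * math.sin(phi),
          math.cos(theta)]
    return [phi_dot * z + theta_dot * a + psi_dot * b
            for z, a, b in zip(z_hat, e2_prime, e3)]

# Arbitrary illustrative angles in radians (assumed values).
theta, phi, psi = 0.7, 1.9, -0.4
R = body_axes(theta, phi, psi)
e3_from_rotations = [R[i][2] for i in range(3)]
e3_from_polar_angles = [math.sin(theta) * math.cos(phi),
                        math.sin(theta) * math.sin(phi),
                        math.cos(theta)]
print(e3_from_rotations, e3_from_polar_angles)
```

The two printed vectors should coincide, which is the statement that $\theta$ and $\phi$ are the polar angles of $\mathbf{e}_3$; the angle $\psi$ drops out of the third column entirely.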

To find the angular momentum or kinetic energy, we need in general to find the
components of $\boldsymbol{\omega}$ relative to the principal axes $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. However, if we are content
to consider just the symmetric case that $\lambda_1 = \lambda_2$, we don’t even have to do this. The
point is that with $\lambda_1 = \lambda_2$ any two perpendicular axes in the plane of $\mathbf{e}_1$ and $\mathbf{e}_2$ are
also principal axes. Thus, instead of $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ we can use the axes $\mathbf{e}_1'$, $\mathbf{e}_2'$, and $\mathbf{e}_3$
where $\mathbf{e}_1'$ and $\mathbf{e}_2'$ are both shown in the second frame of Figure 10.10. This choice has
the advantage that the last two terms in (10.97) need no conversion and, since (as you
should check)

$$\hat{\mathbf{z}} = (\cos\theta)\,\mathbf{e}_3 - (\sin\theta)\,\mathbf{e}_1', \tag{10.98}$$

we find that

$$\boldsymbol{\omega} = (-\dot\phi \sin\theta)\,\mathbf{e}_1' + \dot\theta\,\mathbf{e}_2' + (\dot\psi + \dot\phi \cos\theta)\,\mathbf{e}_3. \tag{10.99}$$


15 Beware of the many different conventions used in defining Euler’s angles. The convention
followed here is most popular in quantum mechanics but less so among authors of classical mechanics.
It has the great advantage that $\theta$ and $\phi$ are precisely the polar angles of the body axis $\mathbf{e}_3$.

Knowing the angular velocity $\boldsymbol{\omega}$ with respect to a set of principal axes, we can
immediately write down the angular momentum $\mathbf{L}$ and the kinetic energy $T$. The
angular momentum is $\mathbf{L} = (\lambda_1\omega_1, \lambda_2\omega_2, \lambda_3\omega_3)$ (relative to any set of principal axes).
Thus in our case

$$\mathbf{L} = (-\lambda_1\dot\phi \sin\theta)\,\mathbf{e}_1' + \lambda_1\dot\theta\,\mathbf{e}_2' + \lambda_3(\dot\psi + \dot\phi \cos\theta)\,\mathbf{e}_3. \tag{10.100}$$


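For future reference, note that the component of $\mathbf{L}$ along the body axis $\mathbf{e}_3$ is just

$$L_3 = \lambda_3\omega_3 = \lambda_3(\dot\psi + \dot\phi \cos\theta). \tag{10.101}$$

Also, as you can check (Problem 10.49), the component along the space axis $\hat{\mathbf{z}}$ is

$$L_z = \lambda_1\dot\phi \sin^2\theta + \lambda_3(\dot\psi + \dot\phi \cos\theta)\cos\theta \tag{10.102}$$

$$\phantom{L_z} = \lambda_1\dot\phi \sin^2\theta + L_3 \cos\theta \tag{10.103}$$

where in the second line I have used (10.101). Also for future reference, note that we
can solve (10.103) for $\dot\phi$ in terms of $\theta$, $L_z$ and $L_3$:

$$\dot\phi = \frac{L_z - L_3 \cos\theta}{\lambda_1 \sin^2\theta}. \tag{10.104}$$

We saw in (10.68) that the kinetic energy is $T = \frac{1}{2}(\lambda_1\omega_1^2 + \lambda_2\omega_2^2 + \lambda_3\omega_3^2)$. Thus,
for a body whose first two principal moments are equal, (10.99) gives

$$T = \tfrac{1}{2}\lambda_1(\dot\phi^2 \sin^2\theta + \dot\theta^2) + \tfrac{1}{2}\lambda_3(\dot\psi + \dot\phi \cos\theta)^2. \tag{10.105}$$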


We shall use this result in a moment to write down the Lagrangian for a spinning top. 
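
The results (10.103) to (10.105) can be spot-checked numerically by starting from the components (10.99) and comparing with the closed forms. The sketch below does this in Python; the function name `top_quantities` and all numerical values are hypothetical choices made only for the check.

```python
import math

def top_quantities(lam1, lam3, theta, theta_dot, phi_dot, psi_dot):
    """Return (L3, Lz, T) for a symmetric body with lambda1 = lambda2."""
    # Components of omega along e1', e2', e3, Eq. (10.99).
    w1 = -phi_dot * math.sin(theta)
    w2 = theta_dot
    w3 = psi_dot + phi_dot * math.cos(theta)
    # Angular momentum along the same principal axes, Eq. (10.100).
    L1, L3 = lam1 * w1, lam3 * w3
    # z-hat = cos(theta) e3 - sin(theta) e1', Eq. (10.98), so Lz = L . z-hat.
    Lz = L3 * math.cos(theta) - L1 * math.sin(theta)
    # Kinetic energy relative to principal axes, Eq. (10.68).
    T = 0.5 * (lam1 * w1 ** 2 + lam1 * w2 ** 2 + lam3 * w3 ** 2)
    return L3, Lz, T

# Arbitrary illustrative values (assumptions, not taken from the text).
lam1, lam3 = 2.0, 1.0
theta, theta_dot, phi_dot, psi_dot = 0.8, 0.3, 1.1, 4.0
L3, Lz, T = top_quantities(lam1, lam3, theta, theta_dot, phi_dot, psi_dot)

# Closed forms quoted in the text.
Lz_closed = lam1 * phi_dot * math.sin(theta) ** 2 + L3 * math.cos(theta)       # (10.103)
T_closed = (0.5 * lam1 * (phi_dot ** 2 * math.sin(theta) ** 2 + theta_dot ** 2)
            + 0.5 * lam3 * (psi_dot + phi_dot * math.cos(theta)) ** 2)         # (10.105)
phi_dot_back = (Lz - L3 * math.cos(theta)) / (lam1 * math.sin(theta) ** 2)     # (10.104)
print(Lz - Lz_closed, T - T_closed, phi_dot_back - phi_dot)
```

All three printed differences should vanish up to rounding, confirming that (10.104) simply inverts (10.103).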


10.10 Motion of a Spinning Top* 

Lagrange’s Equations 

To illustrate the use of Euler angles, let us return to the symmetric spinning top
discussed in Section 10.6 and shown in Figure 10.7. The motion of this system is
most easily solved using the Lagrangian approach, so we’ll start by writing down the
Lagrangian $\mathcal{L} = T - U$. The kinetic energy is given by (10.105), while the potential
energy is $U = MgR \cos\theta$. Therefore the Lagrangian of the top is

$$\mathcal{L} = \tfrac{1}{2}\lambda_1(\dot\phi^2 \sin^2\theta + \dot\theta^2) + \tfrac{1}{2}\lambda_3(\dot\psi + \dot\phi \cos\theta)^2 - MgR \cos\theta. \tag{10.106}$$

With three generalized coordinates, there are three Lagrange equations. The $\theta$
equation is

$$\lambda_1\ddot\theta = \lambda_1\dot\phi^2 \sin\theta \cos\theta - \lambda_3(\dot\psi + \dot\phi \cos\theta)\,\dot\phi \sin\theta + MgR \sin\theta \quad [\theta \text{ equation}]. \tag{10.107}$$

The $\phi$ and $\psi$ equations are simpler because neither $\phi$ nor $\psi$ appears in $\mathcal{L}$, so that both
$\phi$ and $\psi$ are ignorable, and the corresponding generalized momenta are constants. For
$p_\phi$ this gives

$$p_\phi = \frac{\partial\mathcal{L}}{\partial\dot\phi} = \lambda_1\dot\phi \sin^2\theta + \lambda_3(\dot\psi + \dot\phi \cos\theta)\cos\theta = \text{const} \quad [\phi \text{ equation}]. \tag{10.108}$$

Comparing with (10.102), we see that the generalized momentum $p_\phi$ is just the $z$
component $L_z$ of the angular momentum, and the constancy of $p_\phi$ is just a statement
that $L_z$ is conserved, a result we could have anticipated since there is no torque
about the $z$ axis. Similarly, for $p_\psi$ we get

$$p_\psi = \frac{\partial\mathcal{L}}{\partial\dot\psi} = \lambda_3(\dot\psi + \dot\phi \cos\theta) = \text{const} \quad [\psi \text{ equation}]. \tag{10.109}$$

Comparing this with (10.101), we see that $p_\psi$ is the component of $\mathbf{L}$ along the body’s
symmetry axis $\mathbf{e}_3$, and the constancy of $p_\psi$ tells us that $L_3$ is conserved. An important
consequence is that, since $L_3 = \lambda_3\omega_3$, the component $\omega_3$ of the angular velocity is also
constant (a result we proved back at the end of Section 10.7 using Euler’s equations).

Steady Precession 

As a first application of the Lagrange equations, let us investigate whether the top can
exhibit a precession in which the top’s axis moves around the $z$ axis tracing a cone
with constant angle $\theta$. From (10.104) we see that if $\theta$ is constant, then so is $\dot\phi$. That is,
if the top’s axis is to move around a cone of fixed angle $\theta$, it has to do so at a constant
angular velocity $\dot\phi = \Omega$, say. Looking next at the $\psi$ equation (10.109), we see that,
with $\dot\phi$ and $\theta$ fixed, $\dot\psi$ must also be constant.

The rate $\Omega$ at which the top precesses is determined by the $\theta$ equation (10.107).
If $\theta$ is constant, the left side is zero, and, replacing $\dot\phi$ by $\Omega$, and $(\dot\psi + \dot\phi \cos\theta)$ by $\omega_3$,
we find that $\Omega$ must satisfy

$$\lambda_1\Omega^2 \cos\theta - \lambda_3\omega_3\Omega + MgR = 0. \tag{10.110}$$

This is a quadratic equation for $\Omega$. Thus, as long as the roots are real, there are two
different rates $\Omega$ at which the top can precess, for a given tilt $\theta$ and given rate of spin
$\omega_3$. We can write down these two values of $\Omega$ for any values of the other parameters,
but the most interesting case is when the top is spinning rapidly and $\omega_3$ is large. In
this case, it is easy to see that the two roots are real and that one root is much smaller
than the other. The small root is (Problem 10.53)

$$\Omega \approx \frac{MgR}{\lambda_3\omega_3}. \tag{10.111}$$

This slow precession is precisely the motion predicted in Section 10.6 with $\Omega$ given
by (10.83).$^{16}$

The second, larger root of (10.110) (again assuming that $\omega_3$ is very large) is

$$\Omega \approx \frac{\lambda_3\omega_3}{\lambda_1 \cos\theta}. \tag{10.112}$$

(See Problem 10.53.) Notice that this faster precession does not depend on $g$, so we
would expect to observe this one even in the absence of gravity. In fact, this precession
is precisely the free precession predicted in Section 10.8 for a symmetric body moving
in the absence of any torques. As you can check, the value of $\Omega$ predicted here is the
same as what was called $\Omega_s$ in (10.96). (See Problem 10.52.)
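
As a rough numerical illustration of the two precession rates, the sketch below solves the quadratic (10.110) exactly and compares its roots with the fast-spin approximations $MgR/(\lambda_3\omega_3)$ and $\lambda_3\omega_3/(\lambda_1\cos\theta)$. The function name `precession_rates` and the parameter values are assumptions chosen for illustration, and the code assumes $\cos\theta \neq 0$.

```python
import math

def precession_rates(lam1, lam3, omega3, MgR, theta):
    """Both roots of lam1*cos(theta)*Omega**2 - lam3*omega3*Omega + MgR = 0,
    returned as (slow, fast), ordered by magnitude."""
    a = lam1 * math.cos(theta)
    b = -lam3 * omega3
    c = MgR
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("roots are complex: no steady precession at this tilt and spin")
    root = math.sqrt(disc)
    r1 = (-b - root) / (2 * a)
    r2 = (-b + root) / (2 * a)
    slow, fast = sorted((r1, r2), key=abs)
    return slow, fast

# Assumed illustrative values for a rapidly spinning top (arbitrary consistent units).
lam1, lam3, omega3, MgR, theta = 1.0, 0.5, 200.0, 2.0, 0.5
slow, fast = precession_rates(lam1, lam3, omega3, MgR, theta)

slow_approx = MgR / (lam3 * omega3)                    # small root, fast-spin limit
fast_approx = lam3 * omega3 / (lam1 * math.cos(theta)) # large root, fast-spin limit
print(slow, slow_approx)
print(fast, fast_approx)
```

For these values the exact and approximate roots should agree to a small fraction of a percent, and the agreement improves as $\omega_3$ is increased.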


Nutation 

In general, as a top precesses around the vertical axis ($\phi$ changing), the angle $\theta$ varies
as well. Thus as the axis swings around the vertical it also nods up or down in a motion
called nutation, from a Latin word meaning to nod repeatedly. We can investigate the
variation of $\theta$ using the $\theta$ equation (10.107). The first step is to use the $\phi$ and $\psi$
equations to eliminate the variables $\dot\phi$ and $\dot\psi$ in favor of the constants $p_\phi = L_z$ and
$p_\psi = L_3$. This gives a second-order, ordinary differential equation for $\theta$, which can,
at least in principle, be solved to give the time dependence of $\theta$.

To get a qualitative picture of the motion, a simpler procedure is to look at the
total energy, $E = T + U$, with $T$ given by (10.105) and $U = MgR \cos\theta$. In $T$, we
can replace the variables $\dot\phi$ and $\dot\psi$ in favor of the constants $L_z$ and $L_3$, and we find
(Problem 10.51)

$$E = \tfrac{1}{2}\lambda_1\dot\theta^2 + U_{\text{eff}}(\theta) \tag{10.113}$$

where the effective potential energy $U_{\text{eff}}(\theta)$ is

$$U_{\text{eff}}(\theta) = \frac{(L_z - L_3 \cos\theta)^2}{2\lambda_1 \sin^2\theta} + \frac{L_3^2}{2\lambda_3} + MgR \cos\theta. \tag{10.114}$$

Equation (10.113) serves to emphasize that our problem has been reduced to a
one-dimensional problem, involving just the coordinate $\theta$. We can predict the qualitative
behavior of $\theta$ by looking at a graph of $U_{\text{eff}}(\theta)$. The coordinate $\theta$ ranges from 0 to
$\pi$, and, because of the factor of $\sin^2\theta$ in the denominator of the first term, $U_{\text{eff}}(\theta)$
approaches $+\infty$ at the two ends, $\theta = 0$ and $\pi$. It is not hard to convince oneself that
the graph of $U_{\text{eff}}(\theta)$ has the “U” shape shown in Figure 10.11. From (10.113), it is clear
that $E \ge U_{\text{eff}}(\theta)$, so $\theta$ is confined between the two turning points, $\theta_1$ and $\theta_2$, where
$E = U_{\text{eff}}(\theta)$ as shown in the figure. Evidently, $\theta$ oscillates periodically, or “nutates,”
between $\theta_1$ and $\theta_2$ at the same time that the top’s axis precesses around the vertical.
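
The turning points can be located numerically once the constants are fixed. The following Python sketch evaluates the effective potential (10.114) and finds the two angles where it equals $E$ by a grid scan followed by bisection. The function names (`u_eff`, `bisect`, `turning_points`), the scan-plus-bisection approach, and the sample top (released with $\dot\theta = 0$ and $\dot\phi = 0$ at an assumed tilt of 0.5 rad) are illustrative assumptions rather than anything prescribed in the text.

```python
import math

def u_eff(theta, lam1, lam3, Lz, L3, MgR):
    """Effective potential energy of the symmetric top, Eq. (10.114)."""
    return ((Lz - L3 * math.cos(theta)) ** 2 / (2 * lam1 * math.sin(theta) ** 2)
            + L3 ** 2 / (2 * lam3)
            + MgR * math.cos(theta))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def turning_points(E, lam1, lam3, Lz, L3, MgR, samples=2000):
    """Return (theta1, theta2) where U_eff = E, scanning the open interval (0, pi)."""
    def g(th):
        return u_eff(th, lam1, lam3, Lz, L3, MgR) - E
    grid = [math.pi * (i + 0.5) / samples for i in range(samples)]
    roots = [bisect(g, a, b) for a, b in zip(grid, grid[1:])
             if (g(a) > 0) != (g(b) > 0)]
    if len(roots) != 2:
        raise ValueError("expected exactly two turning points")
    return roots[0], roots[1]

# Assumed sample top: lambda1 = 1, lambda3 = 0.5, omega3 = 20, MgR = 2,
# released at theta_start with theta_dot = 0 and phi_dot = 0.
lam1, lam3, omega3, MgR = 1.0, 0.5, 20.0, 2.0
theta_start = 0.5
L3 = lam3 * omega3
Lz = L3 * math.cos(theta_start)        # from (10.103) with phi_dot = 0
E = u_eff(theta_start, lam1, lam3, Lz, L3, MgR)

theta1, theta2 = turning_points(E, lam1, lam3, Lz, L3, MgR)
print(theta1, theta2)
```

In this example the release angle is itself the upper turning point of the axis (the smaller value of $\theta$), and the top nods down to a slightly larger angle before returning.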


16 Here the denominator has a factor $\omega_3$ where (10.83) has $\omega$, but this difference is irrelevant
since both discussions assume that $\omega_3$ is very large, so that $\omega_3 \approx \omega$.

Figure 10.11 The effective potential energy (10.114) that determines
the time dependence of $\theta$ for the symmetric top. Since $E$
can be no less than $U_{\text{eff}}(\theta)$, the motion is confined to the interval
$\theta_1 \le \theta \le \theta_2$.


The details of the motion depend on just how $\phi$ varies. According to (10.104),

$$\dot\phi = \frac{L_z - L_3 \cos\theta}{\lambda_1 \sin^2\theta}, \tag{10.115}$$

which shows that there are two main possibilities: If $L_z$ is larger than $L_3$ (in magnitude),
then $\dot\phi$ cannot vanish. Thus, although $\dot\phi$ may vary, it can never change sign, so
$\phi$ changes in the same direction all the time (always increasing or always decreasing).
Thus, the top precesses in a single direction, while its angle of tilt $\theta$ oscillates between
$\theta_1$ and $\theta_2$, producing the motion sketched in Figure 10.12(a). If $L_z$ is smaller than $L_3$,
then $\dot\phi$ would vanish at an angle $\theta_0$, such that $L_z - L_3 \cos\theta_0 = 0$. If this angle lies
outside the range between $\theta_1$ and $\theta_2$ to which the motion is confined, then $\dot\phi$ still cannot
vanish and the motion is still as sketched in Figure 10.12(a). On the other hand, if the
angle $\theta_0$ lies between $\theta_1$ and $\theta_2$, then $\dot\phi$ will change sign twice in each oscillation of
$\theta$. In this case, the precession moves first in one direction, then in the other, and the
overall motion is as sketched in Figure 10.12(b).

Figure 10.12 Nutation of a top. The top end of the spinning top moves on a
sphere centered on the pivot at the bottom end. (a) Here $\dot\phi$ never vanishes, and
$\phi$ always moves steadily in one direction while $\theta$ oscillates between $\theta_1$ and $\theta_2$.
(b) If $\dot\phi$ changes sign, then $\phi$ moves first forward and then backward while $\theta$
oscillates.


Principal Definitions and Equations of Chapter 10

CM and Relative Motions

$\mathbf{L} = \mathbf{L}$(motion of CM) $+\ \mathbf{L}$(motion relative to CM). [Eq. (10.9)]
and
$T = T$(motion of CM) $+\ T$(motion relative to CM). [Eq. (10.16)]


The Moment of Inertia Tensor 

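The angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ of a rigid body are related by

$\mathbf{L} = \mathbf{I}\boldsymbol{\omega}$ [Eq. (10.42)]

where $\mathbf{L}$ and $\boldsymbol{\omega}$ must be seen as $3 \times 1$ columns and $\mathbf{I}$ is the $3 \times 3$ moment of inertia
tensor, whose diagonal and off-diagonal elements are defined as

$I_{xx} = \sum m_\alpha (y_\alpha^2 + z_\alpha^2)$, etc. and $I_{xy} = -\sum m_\alpha x_\alpha y_\alpha$, etc.
[Eqs. (10.37) & (10.38)]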


Principal Axes 

A principal axis of a body (about a point $O$) is any axis through $O$ with the property
that if $\boldsymbol{\omega}$ points along the axis, then $\mathbf{L}$ is parallel to $\boldsymbol{\omega}$; that is,

$\mathbf{L} = \lambda\boldsymbol{\omega}$ [Eq. (10.65)]

for some real number $\lambda$. For any body and any point $O$, there are three perpendicular
principal axes through $O$. [Section 10.4 and Appendix]

Evaluated with respect to its principal axes, the inertia tensor has the diagonal
form

$$\mathbf{I} = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}. \quad \text{[Eq. (10.79)]}$$

Euler’s Equations

If $\dot{\mathbf{L}}$ denotes the rate of change of a body’s angular momentum as seen in a frame fixed
in the body (body frame), then it satisfies Euler’s equations

$\dot{\mathbf{L}} + \boldsymbol{\omega} \times \mathbf{L} = \boldsymbol{\Gamma}$. [Eqs. (10.87) & (10.88)]


Euler’s Angles

The orientation of a rigid body can be specified by the three Euler angles $\theta$, $\phi$, $\psi$
defined in Figure 10.10. [Sections 10.9 & 10.10]

The Lagrangian for a rigid body spinning about a fixed pivot is

$\mathcal{L} = \tfrac{1}{2}\lambda_1(\dot\phi^2 \sin^2\theta + \dot\theta^2) + \tfrac{1}{2}\lambda_3(\dot\psi + \dot\phi \cos\theta)^2 - MgR \cos\theta$. [Eq. (10.106)]


Problems for Chapter 10

Stars indicate the approximate level of difficulty, from easiest (★) to most difficult (★★★).

Section 10.1 Properties of the Center of Mass

10.1 ★ The result (10.7), that $\sum m_\alpha \mathbf{r}'_\alpha = 0$, can be paraphrased to say that the position vector of the CM
relative to the CM is zero, and, in this form, is nearly obvious. Nevertheless, to be sure you understand
the result, prove it by solving (10.4) for $\mathbf{r}'_\alpha$ and substituting into the sum concerned.

10.2 ★ To illustrate the result (10.18), that the total KE of a body is just the rotational KE relative to
any point that is instantaneously at rest, do the following: Write down the KE of a uniform wheel (mass
M) rolling with speed v along a flat road, as the sum of the energies of the CM motion and the rotation
about the CM. Now write it as the energy of the rotation about the instantaneous point of contact with
the road and show that you get the same answer. (The energy of rotation is $\frac{1}{2}I\omega^2$. The moment of inertia
of a uniform wheel about its center is $I = \frac{1}{2}MR^2$. That about a point on the rim is $I' = \frac{3}{2}MR^2$.)

10.3 ★ Five equal point masses are placed at the five corners of a square pyramid whose square base
is centered on the origin in the xy plane, with side L, and whose apex is on the z axis at a height H
above the origin. Find the CM of the five-mass system.

10.4 ★★ The calculation of centers of mass or moments of inertia usually involves doing an integral,
most often a volume integral, and such integrals are often best done in spherical polar coordinates
(defined back in Figure 4.16). Prove that

$$\int dV\, f(\mathbf{r}) = \int r^2\,dr \int \sin\theta\,d\theta \int d\phi\; f(r, \theta, \phi).$$

[Think about the small volume $dV$ enclosed between $r$ and $r + dr$, $\theta$ and $\theta + d\theta$, and $\phi$ and $\phi + d\phi$.]
If the volume integral on the left runs over all space, what are the limits of the three integrals on the
right?

10.5 ★★ A uniform solid hemisphere of radius R has its flat base in the xy plane, with its center at the
origin. Use the result of Problem 10.4 to find the center of mass. [Comment: This and the next two
problems are intended to reactivate your skills at finding centers of mass by integration. In all cases,
you will need to use the integral form of the definition (10.1) of the CM. If the mass is distributed
through a volume (as here), the integral will be a volume integral with $dm = \varrho\,dV$.]

10.6 ★★ (a) Find the CM of a uniform hemispherical shell of inner and outer radii a and b and mass M
positioned as in Problem 10.5. [See the comment to Problem 10.5 and use the result of Problem 10.4.]
(b) What becomes of your answer when $a = 0$? (c) What if $b \to a$?

10.7 ★★ A “rounded cone” is made by cutting out of a uniform sphere of radius R the volume with
$\theta < \theta_0$, where $\theta$ is the usual angle measured from the polar axis and $\theta_0$ is a constant between 0 and
$\pi$. (a) Describe this cone and use the result of Problem 10.4 to find its volume. (b) Find its CM and
comment on your results for the cases that $\theta_0 = \pi$ and $\theta_0 \to 0$.

10.8 ★★ A uniform thin wire lies along the y axis between $y = \pm L/2$. It is now bent toward the left
into an arc of a circle with radius R, leaving the midpoint at the origin and tangent to the y axis. Find
the CM. [See the comment to Problem 10.5. In this case the integral is a one-dimensional integral.]
Comment on your answer for the cases that $R \to \infty$ and that $2\pi R = L$.

Section 10.2 Rotation about a Fixed Axis

10.9 ★ The moment of inertia of a continuous mass distribution with density $\varrho$ is obtained by converting
the sum of (10.25) into the volume integral $\int \rho^2\,dm = \int \rho^2 \varrho\,dV$. (Note the two forms of the Greek
“rho”: $\rho$ = distance from z axis, $\varrho$ = mass density.) Find the moment of inertia of a uniform circular
cylinder of radius R and mass M for rotation about its axis. Explain why the products of inertia are
zero.

10.10 ★ (a) A thin uniform rod of mass M and length L lies on the x axis with one end at the origin.
Find its moment of inertia for rotation about the z axis. [Here the sum of (10.25) must be replaced by
an integral of the form $\int x^2 \mu\,dx$ where $\mu$ is the linear mass density, mass/length.] (b) What if the rod’s
center is at the origin?

10.11 ★★ (a) Use the result of Problem 10.4 to find the moment of inertia of a uniform solid sphere
(mass M, radius R) for rotation about a diameter. (b) Do the same for a uniform hollow sphere whose
inner and outer radii are a and b. [One slick way to do this is to think of the hollow sphere as a solid
sphere of radius b from which you have removed a sphere of the same density but radius a.]

10.12 ★★ A triangular prism (like a box of Toblerone) of mass M, whose two ends are equilateral
triangles parallel to the xy plane with side $2a$, is centered on the origin with its axis along the z axis.
Find its moment of inertia for rotation about the z axis. Without doing any integrals write down and
explain its two products of inertia for rotation about the z axis.

10.13 ★★ A thin rod (of width zero, but not necessarily uniform) is pivoted freely at one end about
the horizontal z axis, being free to swing in the xy plane (x horizontal, y vertically down). Its mass is
m, its CM is a distance a from the pivot, and its moment of inertia (about the z axis) is I. (a) Write
down the equation of motion $\dot L_z = \Gamma_z$ and, assuming the motion is confined to small angles (measured
from the downward vertical), find the period of this compound pendulum. (“Compound pendulum”
is traditionally used to mean any pendulum whose mass is distributed, as contrasted with a “simple
pendulum,” whose mass is concentrated at a single point on a massless arm.) (b) What is the length of
the “equivalent” simple pendulum, that is, the simple pendulum with the same period?

10.14 ★★ A stationary space station can be approximated as a hollow spherical shell of mass 6 tonnes
(6000 kg) and inner and outer radii of 5 m and 6 m. To change its orientation, a uniform flywheel (radius
10 cm, mass 10 kg) at the center is spun up quickly from rest to 1000 rpm. (a) How long will it take
the station to rotate by 10 degrees? (b) What energy was needed for the whole operation? [To find the
necessary moment of inertia, you could do Problem 10.11.]

10.15 ★★ (a) Write down the integral (as in Problem 10.9) for the moment of inertia of a uniform cube
of side a and mass M, rotating about an edge, and show that it is equal to $\frac{2}{3}Ma^2$. (b) If I balance the
cube on an edge in unstable equilibrium on a rough table, it will eventually topple and rotate until it hits
the table. By considering the energy of the cube, find its angular velocity just before it hits the table.
(Assume the edge does not slide on the table.)

10.16 ★★ Find the moment of inertia for a uniform cube of mass M and edge a as in Problem 10.15
and then do the following: The cube is sliding with velocity v across a flat horizontal frictionless table
when it hits a straight very low step perpendicular to v, and the leading lower edge comes abruptly to
rest. (a) By considering what quantities are conserved before, during, and after the brief collision, find
the cube’s angular velocity just after the collision. (b) Find the minimum speed v for which the cube
rolls over after hitting the step.

10.17 ★★ Write down the integral for the moment of inertia of a uniform ellipsoid with surface
$(x/a)^2 + (y/b)^2 + (z/c)^2 = 1$ for rotation about the z axis. One simple way to do the integral is to
make a change of variables to $\xi = x/a$, $\eta = y/b$, and $\zeta = z/c$. Each of the two resulting integrals can
be related to the corresponding integrals for a sphere (as in Problem 10.11). Do this. Check your answer
for the case $a = b = c$.

10.18 Consider the rod of Problem 10.13. The rod is struck sharply with a horizontal force F which
delivers an impulse $F\,\Delta t = \xi$ a distance b below the pivot. (a) Find the rod’s angular momentum, and
hence momentum, just after the impulse. (b) Find the impulse $\eta$ delivered to the pivot. (c) For what
value of b (call it $b_0$) is $\eta = 0$? (The distance $b_0$ defines the so-called “sweet spot.” If the rod were a
tennis racket and the pivot your hand, then if the ball hits the sweet spot, your hand would experience
no impulse.)

Section 10.3 Rotation about Any Axis; the Inertia Tensor

10.19 ★ Verify that the components of the vector $\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})$ are given correctly by Equation (10.35).
Do this both by working with components and by using the so-called BAC − CAB rule, that is
$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - \mathbf{C}(\mathbf{A} \cdot \mathbf{B})$.

10.20 ★ Show that the inertia tensor is additive, in this sense: Suppose a body A is made up of two
parts B and C. (For instance, a hammer is made up of a wooden handle wedged into a metal head.)
Then $\mathbf{I}_A = \mathbf{I}_B + \mathbf{I}_C$. Similarly, if A can be thought of as the result of removing C from B (as a hollow
spherical shell is the result of removing a small sphere from inside a larger sphere), then $\mathbf{I}_A = \mathbf{I}_B - \mathbf{I}_C$.

10.21 ★★ The definition of the inertia tensor in Equations (10.37) and (10.38) has the rather ugly feature
that the diagonal and off-diagonal elements are defined by completely different equations. Show that
the two definitions can be combined into the single equation (which is slightly less messy in integral
form)

$$I_{ij} = \int \varrho\,(r^2 \delta_{ij} - r_i r_j)\,dV$$

where $\delta_{ij}$ is the Kronecker delta symbol

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases} \tag{10.116}$$

10.22 ★★ A rigid body comprises 8 equal masses m at the corners of a cube of side a, held together by
massless struts. (a) Use the definitions (10.37) and (10.38) to find the moment of inertia tensor $\mathbf{I}$ for
rotation about a corner O of the cube. (Use axes along the three edges through O.) (b) Find the inertia
tensor of the same body but for rotation about the center of the cube. (Again use axes parallel to the
edges.) Explain why in this case certain elements of $\mathbf{I}$ could be expected to be zero.

10.23 ★★ Consider a rigid plane body or “lamina,” such as a flat piece of sheet metal, rotating about a
point O in the body. If we choose axes so that the lamina lies in the xy plane, which elements of the
inertia tensor $\mathbf{I}$ are automatically zero? Prove that $I_{zz} = I_{xx} + I_{yy}$.

10.24 ★★ (a) If $\mathbf{I}^{\text{cm}}$ denotes the moment of inertia tensor of a rigid body (mass M) about its CM, and
$\mathbf{I}$ the corresponding tensor about a point P displaced from the CM by $\boldsymbol{\Delta} = (\xi, \eta, \zeta)$, prove that

$$I_{xx} = I_{xx}^{\text{cm}} + M(\eta^2 + \zeta^2) \quad \text{and} \quad I_{yz} = I_{yz}^{\text{cm}} - M\eta\zeta, \tag{10.117}$$

and so forth. (These results, which generalize the parallel-axis theorem that you probably learned in
introductory physics, mean that once you know the inertia tensor for rotation about the CM, calculating
it for any other origin is trivially easy.) (b) Confirm that the results of Example 10.2 (page 381) fulfill
the identities (10.117) [so that the calculations of part (a) of the example were actually unnecessary].

10.25 ★★ (a) Find all nine elements of the moment of inertia tensor with respect to the CM of a uniform
cuboid (a rectangular brick shape) whose sides are 2a, 2b, and 2c in the x, y, and z directions and whose
mass is M. Explain clearly why you could write down the off-diagonal elements without doing any
integration. (b) Combine the results of part (a) and Problem 10.24 to find the moment of inertia tensor
of the same cuboid with respect to the corner A at (a, b, c). (c) What is the angular momentum about
A if the cuboid is spinning with angular velocity $\omega$ around the edge through A and parallel to the x
axis?

10.26 ★★ (a) Prove that in cylindrical polar coordinates a volume integral takes the form

$$\int dV\, f(\mathbf{r}) = \int \rho\,d\rho \int d\phi \int dz\; f(\rho, \phi, z).$$

(b) Show that the moment of inertia of a uniform solid cone pivoted at its tip and rotating about its
axis is given by the integral (10.58), explaining clearly the limits of integration. Show that the integral
evaluates to $\frac{3}{10}MR^2$. (c) Prove also that $I_{xx} = \frac{3}{20}M(R^2 + 4h^2)$ as in Equation (10.61).

10.27 Find the inertia tensor for a uniform, thin hollow cone, such as an ice-cream cone, of mass
M, height h, and base radius R, spinning about its pointed end.

10.28 ★★★ Find the moment of inertia tensor $\mathbf{I}$ for the triangular prism of Problem 10.12 with height h.
(If you did Problem 10.12, you’ve already done about half the work.) Your result should show that $\mathbf{I}$ has
the form we’ve found for an axisymmetric body. This suggests what is true, that three-fold symmetry
about an axis (symmetry under rotations of 120 degrees) is enough to ensure this form.
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section 10.4 Principal Axes of Inertia 

10.29* Prove that if the axes Ox, Oy, and Oz are principal axes of a certain rigid body, then the 
inertia tensor (with respect to these axes) is diagonal with the principal moments down the diagonal as 
in ( 10 . 66 ). 

10.30 * Consider a lamina, such as a flat piece of sheet metal, rotating about a point O in the body. 
Prove that the axis through O and perpendicular to the plane is a principal axis. [Hint: See Problem 
10.23.] 

10.31 ** Consider an arbitrary rigid body with an axis of rotational symmetry, which we’ll call z. 
(a) Prove that the axis of symmetry is a principal axis, (b) Prove that any two directions x and 
y perpendicular to z and each other are also principal axes, (c) Prove that the principal moments 
corresponding to these two axes are equal: X x = X 2 . 

10.32 ★★ Show that the principal moments of any rigid body satisfy X 3 < X x + X 2 . [Hint: Look at the 
integrals that define these moments.] In particular, if X x = X 2 , then X 3 < 2X x . (b) For what shape of 
body is X 3 = X x + X 2 ? 

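10.33 *** Here is a good exercise in vector identities and matrices, leading to some important general 
results: (a) For a rigid body made up of particles of mass $m_\alpha$, spinning about an axis through the origin 
with angular velocity $\boldsymbol{\omega}$, prove that its total kinetic energy can be written as 

$$T = \tfrac{1}{2}\sum_\alpha m_\alpha\left[(\omega r_\alpha)^2 - (\boldsymbol{\omega}\cdot\mathbf{r}_\alpha)^2\right].$$

Remember that $\mathbf{v}_\alpha = \boldsymbol{\omega}\times\mathbf{r}_\alpha$. You may find the following vector identity useful: For any two vectors 
$\mathbf{a}$ and $\mathbf{b}$, 

$$(\mathbf{a}\times\mathbf{b})^2 = a^2b^2 - (\mathbf{a}\cdot\mathbf{b})^2.$$

(If you use the identity, please prove it.) (b) Prove that the angular momentum $\mathbf{L}$ of the body can be 
written as 

$$\mathbf{L} = \sum_\alpha m_\alpha\left[\boldsymbol{\omega}\,r_\alpha^2 - \mathbf{r}_\alpha(\boldsymbol{\omega}\cdot\mathbf{r}_\alpha)\right].$$

For this you will need the so-called BAC $-$ CAB rule, that $\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B})$. 
(c) Combine the results of parts (a) and (b) to prove that 

$$T = \tfrac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} = \tfrac{1}{2}\,\tilde{\boldsymbol{\omega}}\,\mathbf{I}\,\boldsymbol{\omega}.$$

Prove both equalities. The last expression is a matrix product; $\boldsymbol{\omega}$ denotes the $3\times 1$ column of numbers 
$\omega_x, \omega_y, \omega_z$, the tilde on $\boldsymbol{\omega}$ denotes the matrix transpose (in this case a row), and $\mathbf{I}$ is the moment of inertia 
tensor. This result is actually quite important; it corresponds to the much more obvious result that for 
a particle, $T = \tfrac{1}{2}\mathbf{v}\cdot\mathbf{p}$. (d) Show that with respect to the principal axes, $T = \tfrac{1}{2}(\lambda_1\omega_1^2 + \lambda_2\omega_2^2 + \lambda_3\omega_3^2)$, 
as in Equation (10.68). 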


SECTION 10.5 Finding the Principal Axes; Eigenvalue Equations 

10.34 * The inertia tensor $\mathbf{I}$ for a solid cube is given by (10.72). Verify that $\det(\mathbf{I} - \lambda\mathbf{1})$ is as given in 
(10.73). 

10.35 ** A rigid body consists of three masses fastened as follows: $m$ at $(a, 0, 0)$, $2m$ at $(0, a, a)$, and 
$3m$ at $(0, a, -a)$. (a) Find the inertia tensor $\mathbf{I}$. (b) Find the principal moments and a set of orthogonal 
principal axes. 

10.36 ** A rigid body consists of three equal masses ($m$) fastened at the positions $(a, 0, 0)$, $(0, a, 2a)$, 
and $(0, 2a, a)$. (a) Find the inertia tensor $\mathbf{I}$. (b) Find the principal moments and a set of orthogonal 
principal axes. 

10.37 *** A thin, flat, uniform metal triangle lies in the xy plane with its corners at (1, 0, 0), (0, 1, 0), 
and the origin. Its surface density (mass/area) is $\sigma = 24$. (Distances and masses are measured in 
unspecified units, and the number 24 was chosen to make the answer come out nicely.) (a) Find the 
triangle's inertia tensor $\mathbf{I}$. (b) What are its principal moments and the corresponding axes? 

10.38 *** Suppose that you have found three independent principal axes (directions $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$) and 
corresponding principal moments $\lambda_1$, $\lambda_2$, $\lambda_3$ of a rigid body whose moment of inertia tensor $\mathbf{I}$ (not 
diagonal) you had calculated. (You may assume, what is actually fairly easy to prove, that all of the 
quantities concerned are real.) (a) Prove that if $\lambda_i \ne \lambda_j$ then it is automatically the case that $\mathbf{e}_i\cdot\mathbf{e}_j = 0$. 
(It may help to introduce a notation that distinguishes between vectors and matrices. For example, you 
could use an underline to indicate a matrix, so that $\underline{a}$ is the $3\times 1$ matrix that represents the vector 
$\mathbf{a}$, and the vector scalar product $\mathbf{a}\cdot\mathbf{b}$ is the same as the matrix product $\tilde{\underline{a}}\,\underline{b}$ or $\tilde{\underline{b}}\,\underline{a}$. Then consider the 
number $\tilde{\underline{e}}_i\,\underline{I}\,\underline{e}_j$, which can be evaluated in two ways using the fact that both $\mathbf{e}_i$ and $\mathbf{e}_j$ are eigenvectors 
of $\mathbf{I}$.) (b) Use the result of part (a) to show that if the three principal moments are all different, then 
the directions of three principal axes are uniquely determined. (c) Prove that if two of the principal 
moments are equal, $\lambda_1 = \lambda_2$ say, then any direction in the plane of $\mathbf{e}_1$ and $\mathbf{e}_2$ is also a principal axis 
with the same principal moment. In other words, when $\lambda_1 = \lambda_2$ the corresponding principal axes are 
not uniquely determined. (d) Prove that if all three principal moments are equal, then any axis is a 
principal axis with the same principal moment. 

SECTION 10.6 Precession of a Top Due to a Weak Torque 

10.39 * Consider a top consisting of a uniform cone spinning freely about its tip at 1800 rpm. If its 
height is 10 cm and its base radius 2.5 cm, at what angular velocity will it precess? 

SECTION 10.7 Euler's Equations 

10.40 ** (a) A rigid body is rotating freely, subject to zero torque. Use Euler's equations (10.88) 
to prove that the magnitude of the angular momentum $\mathbf{L}$ is constant. (Multiply the $i$th equation by 
$L_i = \lambda_i\omega_i$ and add the three equations.) (b) In much the same way, show that the kinetic energy of 
rotation $T_{\text{rot}} = \tfrac{1}{2}(\lambda_1\omega_1^2 + \lambda_2\omega_2^2 + \lambda_3\omega_3^2)$, as in (10.68), is constant. 

10.41 ** Consider a lamina rotating freely (no torques) about a point O of the lamina. Use Euler's 
equations to show that the component of $\boldsymbol{\omega}$ in the plane of the lamina has constant magnitude. [Hint: 
Use the results of Problems 10.23 and 10.30. According to Problem 10.30, if you choose the direction 
$\mathbf{e}_3$ normal to the plane of the lamina, $\mathbf{e}_3$ points along a principal axis. Then what you have to prove is 
that the time derivative of $\omega_1^2 + \omega_2^2$ is zero.] 

SECTION 10.8 Euler's Equations with Zero Torque 

10.42 * I take a book that is 30 cm × 20 cm × 3 cm and is held shut by a rubber band, and I throw it 
into the air spinning about an axis that is close to the book's shortest symmetry axis at 180 rpm. What 
is the angular frequency of the small oscillations of its axis of rotation? What if I spin it about an axis 
close to the longest symmetry axis? 

10.43 ** I throw a thin uniform circular disc (think of a frisbee) into the air so that it spins with angular 
velocity $\boldsymbol{\omega}$ about an axis which makes an angle $\alpha$ with the axis of the disc. (a) Show that the magnitude 
of $\boldsymbol{\omega}$ is constant. [Look at Equation (10.94).] (b) Show that as seen by me, the disc's axis precesses 
around the fixed direction of the angular momentum with angular velocity $\Omega_s = \omega\sqrt{4 - 3\sin^2\alpha}$. (The 
results of Problems 10.23 and 10.46 will be useful.) 

10.44 ** An axially symmetric space station (principal axis $\mathbf{e}_3$, and $\lambda_1 = \lambda_2$) is floating in free space. 
It has rockets mounted symmetrically on either side that are firing and exert a constant torque $\Gamma$ about 
the symmetry axis. Solve Euler's equations exactly for $\boldsymbol{\omega}$ (relative to the body axis) and describe the 
motion. At $t = 0$ take $\boldsymbol{\omega} = (\omega_{10}, 0, \omega_{30})$. 

10.45 ** Because of the earth's equatorial bulge, its moment about the polar axis is slightly greater 
than the other two moments, $\lambda_3 = 1.00327\lambda_1$ (but $\lambda_1 = \lambda_2$). (a) Show that the free precession described 
in Section 10.8 should have period 305 days. (As described in the text, the period of this “Chandler 
wobble” is actually more like 400 days.) (b) The angle between the polar axis and $\boldsymbol{\omega}$ is about 0.2 arc 
seconds. Use Equation (10.118) from Problem 10.46 to show that as seen from the space frame the 
period of this wobble should be about a day. 

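10.46 We saw in Section 10.8 that in the free precession of an axially symmetric body the three 
vectors $\mathbf{e}_3$ (the body axis), $\boldsymbol{\omega}$, and $\mathbf{L}$ lie in a plane. As seen in the body frame, $\mathbf{e}_3$ is fixed, and $\boldsymbol{\omega}$ and $\mathbf{L}$ 
precess around $\mathbf{e}_3$ with angular velocity $\Omega_b = \omega_3(\lambda_1 - \lambda_3)/\lambda_1$. As seen in the space frame $\mathbf{L}$ is fixed 
and $\boldsymbol{\omega}$ and $\mathbf{e}_3$ precess around $\mathbf{L}$ with angular velocity $\Omega_s$. In this problem you will find three equivalent 
expressions for $\Omega_s$. (a) Argue that $\boldsymbol{\Omega}_s = \boldsymbol{\Omega}_b + \boldsymbol{\omega}$. [Remember that relative angular velocities add like 
vectors.] (b) Bearing in mind that $\boldsymbol{\Omega}_b$ is parallel to $\mathbf{e}_3$, prove that $\Omega_s = \omega\sin\alpha/\sin\theta$ where $\alpha$ is the angle 
between $\mathbf{e}_3$ and $\boldsymbol{\omega}$ and $\theta$ is that between $\mathbf{e}_3$ and $\mathbf{L}$ (see Figure 10.9). (c) Thence prove that 

$$\Omega_s = \omega\,\frac{\sin\alpha}{\sin\theta} = \frac{L}{\lambda_1} = \omega\,\frac{\sqrt{\lambda_3^2 + (\lambda_1^2 - \lambda_3^2)\sin^2\alpha}}{\lambda_1}. \tag{10.118}$$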


10.47 *** Imagine that this world is perfectly rigid, uniform, and spherical and is spinning about its 
usual axis at its usual rate. A huge mountain of mass $10^{-8}$ earth masses is now added at colatitude 60°, 
causing the earth to begin the free precession described in Section 10.8. How long will it take the North 
Pole (defined as the northern end of the diameter along $\boldsymbol{\omega}$) to move 100 miles from its current position? 
[Take the earth's radius to be 4000 miles.] 


SECTION 10.9 Euler Angles 

10.48 ** Equation (10.97) gives the angular velocity of a body in terms of an unholy mixture of unit 
vectors. (a) Find $\boldsymbol{\omega}$ in terms of $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$. (b) Do the same in terms of $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. 

10.49 ** Starting from Equation (10.100) for $\mathbf{L}$, verify that $L_z$ is correctly given by Equations (10.102) 
and (10.103). 

10.50 ** Equation (10.105) gives the kinetic energy in terms of Euler angles for a body with $\lambda_1 = \lambda_2$. 
Find the corresponding expression for a body whose three principal moments are all different. 

SECTION 10.10 Motion of a Spinning Top 

10.51 * Verify that the energy of a symmetric top can be written as $E = \tfrac{1}{2}\lambda_1\dot{\theta}^2 + U_{\text{eff}}(\theta)$, where the 
effective potential energy is as given in (10.114). 

10.52 ** Consider the rapid steady precession of a symmetric top predicted in connection with 
(10.112). (a) Show that in this motion the angular momentum $\mathbf{L}$ must be very close to the vertical. 
[Hint: Use (10.100) to write down the horizontal component $L_{\text{hor}}$ of $\mathbf{L}$. Show that if $\dot{\phi}$ is given by 
the right side of (10.112), $L_{\text{hor}}$ is exactly zero.] (b) Use this result to show that the rate of precession 
$\Omega$ given in (10.112) agrees with the free precession rate $\Omega_s$ found in (10.96). 

10.53 ** In the discussion of steady precession of a top in Section 10.10, the rates $\Omega$ at which 
steady precession can occur were determined by the quadratic equation (10.110). In particular, we 
examined this equation for the case that $\omega_3$ is very large. In this case you can write the equation as 
$a\Omega^2 + b\Omega + c = 0$ where $b$ is very large. (a) Verify that when $b$ is very large, the two solutions of this 
equation are approximately $-c/b$ (which is small) and $-b/a$ (which is large). What precisely does the 
condition “$b$ is very large” entail? (b) Verify that these give the two solutions claimed in (10.111) and 
(10.112). 

10.54 *** [Computer] The nutation of a top is controlled by the effective potential energy (10.114). 
Make a plot of $U_{\text{eff}}(\theta)$ as follows: (a) First, since the second term of $U_{\text{eff}}(\theta)$ is a constant, you can 
ignore it. Next, by choice of your units, you can take $MgR = 1 = \lambda_1$. The remaining parameters $L_z$ 
and $L_3$ are genuinely independent parameters. To be definite set $L_z = 10$ and $L_3 = 8$ and plot $U_{\text{eff}}(\theta)$ 
as a function of $\theta$. (b) Explain clearly how you could use your graph to determine the angle $\theta_0$ at which 
the top could precess steadily with $\theta$ = constant. Find $\theta_0$ to three significant figures. (c) Find the rate of 
this steady precession, $\Omega = \dot{\phi}$, as given by (10.115). Compare with the approximate value of $\Omega$ given 
by (10.112). 

10.55 *** The analysis of the free precession of a symmetric body in Section 10.8 was based on Euler's 
equations. Obtain the same results using Euler's angles as follows: Since $\mathbf{L}$ is constant you may as well 
choose the space axis $\hat{\mathbf{z}}$ so that $\mathbf{L} = L\hat{\mathbf{z}}$. (a) Use Equation (10.98) for $\hat{\mathbf{z}}$ to write $\mathbf{L}$ in terms of the unit 
vectors $\mathbf{e}'_1$, $\mathbf{e}'_2$, and $\mathbf{e}_3$. (b) By comparing this expression with (10.100), obtain three equations for $\dot{\theta}$, $\dot{\phi}$, 
and $\dot{\psi}$. (c) Hence show that $\theta$ and $\dot{\phi}$ are constant, and that the rate of precession of the body axis about 
the space axis $\hat{\mathbf{z}}$ is $\Omega_s = L/\lambda_1$ as in (10.96). (d) Using (10.99) show that the angle between $\boldsymbol{\omega}$ and $\mathbf{e}_3$ is 
constant and that the three vectors $\mathbf{L}$, $\boldsymbol{\omega}$, and $\mathbf{e}_3$ are always coplanar. 

10.56 *** An important special case of the motion of a symmetric top occurs when it spins about a 
vertical axis. Analyze this motion as follows: (a) By inspecting the effective PE (10.114), show that 
if at any time $\theta = 0$, then $L_3$ and $L_z$ must be equal. (b) Set $L_z = L_3 = \lambda_3\omega_3$ and then make a Taylor 
expansion of $U_{\text{eff}}(\theta)$ about $\theta = 0$ to terms of order $\theta^2$. (c) Show that if $\omega_3 > \omega_{\min} = 2\sqrt{MgR\lambda_1/\lambda_3^2}$, 
then the position $\theta = 0$ is stable, but if $\omega_3 < \omega_{\min}$ it is unstable. (In practice, friction slows the top's 
spinning. Thus with $\omega_3$ sufficiently fast, the vertical top is stable, but as it slows down the top will 
eventually lurch away from the vertical when $\omega_3$ reaches $\omega_{\min}$.) 

10.57 *** (a) Find the Lagrangian for a symmetric top whose tip is free to slide on a frictionless, 
horizontal table. For generalized coordinates use the Euler angles $(\theta, \phi, \psi)$ plus $X$ and $Y$, where 
$(X, Y, Z)$ is the CM position relative to a fixed point on the table. (Note that the vertical position $Z$ is 
not an independent coordinate, since $Z = R\cos\theta$.) (b) Show that the CM motion of $(X, Y)$ separates 
completely from the rotational motion. (c) Consider the two possible rates of steady precession (10.111) 
and (10.112) (for given $\theta$ and $\omega_3$). How do these differ in the present case from their corresponding 
values when the tip is held at a fixed pivot? 




CHAPTER 11 


Coupled Oscillators and Normal Modes 


In Chapter 5 we discussed the oscillations of a single body, such as a mass on the 
end of a fixed spring. I now want to take up the oscillations of several bodies, such as 
the atoms that make up a molecule like CO$_2$, which we can imagine as a system of 
masses connected to one another by springs. If each mass were attached to a separate 
fixed spring, with no connections between the masses, then each would oscillate 
independently, as described in Chapter 5, and there would be nothing more to say. 
Thus our interest here is a system of masses that can oscillate and are connected to 
one another in some way — a system of coupled oscillators. A single oscillator has 
a single natural frequency, at which (in the absence of damping or driving forces) it 
will oscillate for ever. We shall find that two or more coupled oscillators have several 
natural (or “normal”) frequencies and that the general motion is a combination of 
vibrations at all the different natural frequencies. 

Like the theory of rotating bodies in Chapter 10, the theory of coupled oscillators 
makes essential use of matrices, and many of the ideas you learned in Chapter 10 play 
an important role here. The most obvious applications of the ideas of this chapter are to 
the study of molecules, but there are many others, including acoustics, the vibrations 
of structures like bridges and buildings, and coupled electrical circuits. 

Throughout this chapter I shall assume that all of the forces with which we are 
concerned obey Hooke’s law and hence that the equations of motion are all linear. 
While this is a special case, it is a very important special case, with many applications 
in mechanics and throughout physics. Nevertheless, you should bear in mind that the 
systems discussed here are a special case; we shall see in Chapter 12 how startlingly 
more complicated the motion of nonlinear oscillators can be. 


11.1 Two Masses and Three Springs 


As a simple first example of coupled oscillators, consider the two carts shown in 
Figure 11.1. The carts move without friction on a horizontal track, between two fixed 
walls. Each is attached to its adjacent wall by a spring (force constants $k_1$ and $k_3$), and 
the carts are attached to each other by a spring with force constant $k_2$. In the absence 
of the spring 2, the two carts would oscillate independently of each other. Thus it is 
spring 2 that “couples” the two oscillators. In fact, spring 2 makes it impossible that 
either cart move without the other moving as well: For example, if cart 1 is stationary 
and cart 2 moves, the length of spring 2 will change, which will produce a changing 
force on cart 1, causing it to move as well.¹ 

Figure 11.1 Two carts attached to fixed walls by the springs labeled $k_1$ and 
$k_3$, and to each other by $k_2$. The carts' positions $x_1$ and $x_2$ are measured from 
their respective equilibrium positions. 

It is easy to find the equations of motion for the two carts, using either Newton's 
second law or Lagrange's equations. In general, Lagrange's equations are easier 
to write down, but in the present simple case, Newton's law may be a little more 
instructive. Suppose the two carts have moved distances $x_1$ and $x_2$ (measured to the 
right) from their equilibrium positions. Spring 1 is now stretched by an amount $x_1$ and 
so exerts a force $k_1x_1$ to the left on cart 1. Spring 2 is more complicated since it is 
affected by the positions of both carts, but you can easily convince yourself that it is 
stretched by the amount $x_2 - x_1$ and exerts a force $k_2(x_2 - x_1)$ to the right on cart 1. 
Thus the net force on cart 1 is 

$$\begin{aligned}(\text{net force on cart 1}) &= -k_1x_1 + k_2(x_2 - x_1) \\ &= -(k_1 + k_2)x_1 + k_2x_2\end{aligned} \tag{11.1}$$

where the second line is just to show more clearly the dependence on the two variables 
$x_1$ and $x_2$. You can find the net force on cart 2 in the same way, and the two equations 
of motion are 

$$\begin{aligned} m_1\ddot{x}_1 &= -(k_1 + k_2)x_1 + k_2x_2 \\ m_2\ddot{x}_2 &= k_2x_1 - (k_2 + k_3)x_2. \end{aligned} \tag{11.2}$$

Before we try to solve these two coupled equations, notice that they can be written 
in the beautifully compact matrix form 

$$\mathbf{M}\ddot{\mathbf{x}} = -\mathbf{K}\mathbf{x}. \tag{11.3}$$


¹ In the following discussion, it is simplest to assume that when the two carts are at their 
equilibrium positions the three springs are neither stretched nor compressed. (Their lengths are 
equal to their natural, unstretched lengths.) However, depending on the distance between the two 
walls, it could be that all three springs are compressed or all three are stretched. Fortunately, as you 
can easily check (Problem 11.1), none of the results of the next three sections is affected by these 
possibilities. 

Here I have introduced the (2 × 1) column matrix (or “column vector”) 

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{11.4}$$

which labels the configuration of our system. (It has 2 elements because the system 
has 2 degrees of freedom; for a system with n degrees of freedom it would have n 
elements.) I have also defined two square matrices, 

$$\mathbf{M} = \begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix} \quad\text{and}\quad \mathbf{K} = \begin{bmatrix} k_1 + k_2 & -k_2 \\ -k_2 & k_2 + k_3 \end{bmatrix}. \tag{11.5}$$

The “mass matrix” $\mathbf{M}$ is (in this simple case, at least) a diagonal matrix, with the 
masses $m_1$ and $m_2$ down the diagonal. The “spring-constant matrix” $\mathbf{K}$ has nonzero 
off-diagonal elements, reflecting that the right sides of the two equations (11.2) couple 
$x_1$ and $x_2$. Notice that the matrix equation (11.3) is a very natural generalization of 
the equation of motion of a single cart on a single spring: With just one degree of 
freedom, all three matrices $\mathbf{x}$, $\mathbf{M}$, and $\mathbf{K}$ are just (1 × 1) matrices, that is, ordinary 
numbers. The configuration $\mathbf{x}$ is the cart's position $x$, the mass matrix $\mathbf{M}$ is the cart's 
mass $m$, and $\mathbf{K}$ is the spring constant $k$. And the equation of motion (11.3) is just the 
familiar $m\ddot{x} = -kx$. Notice also that both matrices $\mathbf{M}$ and $\mathbf{K}$ are symmetric, as will 
be true of all the corresponding matrices in this chapter. Although the symmetry of 
$\mathbf{M}$ and $\mathbf{K}$ does not play a very obvious role in the discussions here, it is in fact a key 
property of the underlying mathematics, as we shall see in the appendix. 
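
The equivalence of the component equations (11.2) and the matrix form (11.3) is easy to check numerically. The following is a minimal sketch, not part of the original text: the function names and the numerical values of the spring constants and displacements are hypothetical, chosen only for illustration.

```python
def net_forces(x1, x2, k1, k2, k3):
    """Right sides of the component equations (11.2): forces on carts 1 and 2."""
    f1 = -(k1 + k2) * x1 + k2 * x2
    f2 = k2 * x1 - (k2 + k3) * x2
    return [f1, f2]


def spring_matrix(k1, k2, k3):
    """Spring-constant matrix K of (11.5), stored as a nested list."""
    return [[k1 + k2, -k2],
            [-k2, k2 + k3]]


def minus_K_times_x(K, x):
    """Right side of the matrix equation (11.3), -K x, for a 2-element column x."""
    return [-(K[i][0] * x[0] + K[i][1] * x[1]) for i in range(2)]


# Hypothetical values, in arbitrary consistent units.
k1, k2, k3 = 2.0, 0.5, 3.0
x = [0.1, -0.3]

component_form = net_forces(x[0], x[1], k1, k2, k3)
matrix_form = minus_K_times_x(spring_matrix(k1, k2, k3), x)
print(component_form, matrix_form)  # the two lists agree
```

The off-diagonal entries of `spring_matrix` are both $-k_2$, which is the symmetry of $\mathbf{K}$ noted above.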

In trying to solve the equation of motion (11.3) we might reasonably guess that 
there could be solutions in which both carts oscillate sinusoidally with the same 
angular frequency $\omega$; that is, 

$$x_1(t) = \alpha_1\cos(\omega t - \delta_1) \tag{11.6}$$

and 

$$x_2(t) = \alpha_2\cos(\omega t - \delta_2). \tag{11.7}$$

In any event, there is nothing to stop us trying to find solutions of this form. (And we 
shall, in fact, succeed!) If there is a solution of this form, then there will certainly also 
be a solution of the same form but with the cosines replaced by sines: 

$$y_1(t) = \alpha_1\sin(\omega t - \delta_1) \quad\text{and}\quad y_2(t) = \alpha_2\sin(\omega t - \delta_2)$$

and there is nothing to stop me combining these two solutions into a single complex 
solution 

$$z_1(t) = x_1(t) + iy_1(t) = \alpha_1e^{i(\omega t - \delta_1)} = a_1e^{i\omega t}, \tag{11.8}$$

where $a_1 = \alpha_1e^{-i\delta_1}$, and, likewise, 

$$z_2(t) = x_2(t) + iy_2(t) = \alpha_2e^{i(\omega t - \delta_2)} = a_2e^{i\omega t}. \tag{11.9}$$

This trick of introducing a “fictitious” complex solution to the equation of motion 
is the same trick introduced in Section 5.5. I am not, of course, claiming that these 
complex numbers represent the actual motion of the two carts. The actual motion is 
given by the two real numbers (11.6) and (11.7). Nevertheless, for the right choices 
of $a_1$, $a_2$, and $\omega$, the two complex numbers (11.8) and (11.9) are (as we shall see) 
solutions of the equation of motion, and their real parts describe the actual motion 
of our system. The great advantage of the complex numbers is that, as you can see 
from the right sides of (11.8) and (11.9), they both have the same time dependence, 
given by the common factor $e^{i\omega t}$. This lets us combine the two complex solutions into 
a single (2 × 1) matrix solution of the form 

$$\mathbf{z}(t) = \begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} e^{i\omega t} = \mathbf{a}e^{i\omega t}, \tag{11.10}$$

where the column $\mathbf{a}$ is a constant, made up of two complex numbers, 

$$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \alpha_1e^{-i\delta_1} \\ \alpha_2e^{-i\delta_2} \end{bmatrix}.$$

In seeking solutions of the equation of motion (11.3), we shall accordingly try for 
solutions $\mathbf{z}(t)$ of the complex form (11.10), bearing in mind that when we find such 
solutions the actual motion $\mathbf{x}(t)$ is equal to the real part of $\mathbf{z}(t)$, 

$$\mathbf{x}(t) = \operatorname{Re}\mathbf{z}(t).$$


When we substitute the form (11.10) into Equation (11.3), $\mathbf{M}\ddot{\mathbf{x}} = -\mathbf{K}\mathbf{x}$, we obtain 
the equation 

$$-\omega^2\mathbf{M}\mathbf{a}e^{i\omega t} = -\mathbf{K}\mathbf{a}e^{i\omega t},$$

or, cancelling the common exponential factor and rearranging, 

$$(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0. \tag{11.11}$$

This equation is a generalization of the eigenvalue equation studied in Section 10.5. 
(In the usual eigenvalue equation, what we are calling $\omega^2$ is the eigenvalue, and 
where we have the matrix $\mathbf{M}$ the ordinary eigenvalue equation has the unit matrix 
$\mathbf{1}$.) It can be solved in almost exactly the same way. If the matrix $(\mathbf{K} - \omega^2\mathbf{M})$ has 
nonzero determinant, then the only solution of (11.11) is the trivial solution $\mathbf{a} = 0$, 
corresponding to no motion at all. On the other hand, if 

$$\det(\mathbf{K} - \omega^2\mathbf{M}) = 0, \tag{11.12}$$

then there certainly is a nontrivial solution of (11.11) and hence a solution of the 
equations of motion with our assumed sinusoidal form (11.10). In the present case, 
the matrices $\mathbf{K}$ and $\mathbf{M}$ are (2 × 2) matrices, so the equation (11.12) is a quadratic 
equation for $\omega^2$ and has (in general) two solutions for $\omega^2$. This implies that there are 
two frequencies $\omega$ at which the carts can oscillate in pure sinusoidal motion as in 
(11.10) [or, rather, (11.6) and (11.7) for the actual real motion].² 
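
For the two-cart system the determinant condition (11.12) can be expanded by hand into the quadratic $m_1m_2\,(\omega^2)^2 - \left[(k_1+k_2)m_2 + (k_2+k_3)m_1\right]\omega^2 + \left[(k_1+k_2)(k_2+k_3) - k_2^2\right] = 0$, using the matrices of (11.5). The sketch below solves this quadratic with the standard formula. The function name is hypothetical, and the code assumes positive masses and spring constants, so that both roots for $\omega^2$ are real and positive.

```python
import math


def normal_frequencies(m1, m2, k1, k2, k3):
    """Return the two normal frequencies (lower, higher) from det(K - w^2 M) = 0.

    Assumes positive masses and spring constants.
    """
    K11, K12, K22 = k1 + k2, -k2, k2 + k3
    # Coefficients of the quadratic a*lam**2 + b*lam + c = 0, with lam = omega**2.
    a = m1 * m2
    b = -(K11 * m2 + K22 * m1)
    c = K11 * K22 - K12 ** 2
    root = math.sqrt(b * b - 4 * a * c)
    lam_low = (-b - root) / (2 * a)
    lam_high = (-b + root) / (2 * a)
    return math.sqrt(lam_low), math.sqrt(lam_high)


# Illustrative values only: unit masses, unequal springs.
print(normal_frequencies(1.0, 1.0, 2.0, 0.5, 3.0))
```

With all masses equal to $m$ and all spring constants equal to $k$, the quadratic factors as $(k - m\omega^2)(3k - m\omega^2) = 0$, the special case taken up in Section 11.2.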

The two frequencies at which our system can oscillate sinusoidally (the so-called 
normal frequencies) are determined by the quadratic equation (11.12) for $\omega^2$. The 
details of this equation depend on the values of the three spring constants and the 
two masses. While the general case is perfectly straightforward, it is not especially 
illuminating, and I shall discuss instead two special cases where one can understand 
more easily what is going on. I shall start with the case that the three springs are 
identical, and likewise the two masses. 


11.2 Identical Springs and Equal Masses 


Let us continue to examine the two carts of Figure 11.1, but suppose now that 
the two masses are equal, $m_1 = m_2 = m$, and similarly the three spring constants, 
$k_1 = k_2 = k_3 = k$. In this case, the matrices $\mathbf{M}$ and $\mathbf{K}$ defined in (11.5) reduce to 

$$\mathbf{M} = \begin{bmatrix} m & 0 \\ 0 & m \end{bmatrix} \quad\text{and}\quad \mathbf{K} = \begin{bmatrix} 2k & -k \\ -k & 2k \end{bmatrix}. \tag{11.13}$$

The matrix $(\mathbf{K} - \omega^2\mathbf{M})$ of the generalized³ eigenvalue equation (11.11) becomes 

$$(\mathbf{K} - \omega^2\mathbf{M}) = \begin{bmatrix} 2k - m\omega^2 & -k \\ -k & 2k - m\omega^2 \end{bmatrix} \tag{11.14}$$

and its determinant is 

$$\det(\mathbf{K} - \omega^2\mathbf{M}) = (2k - m\omega^2)^2 - k^2 = (k - m\omega^2)(3k - m\omega^2).$$

The two normal frequencies are determined by the condition that this determinant be 
zero and are therefore 

$$\omega_1 = \sqrt{\frac{k}{m}} \quad\text{and}\quad \omega_2 = \sqrt{\frac{3k}{m}}. \tag{11.15}$$

These two normal frequencies are the frequencies at which our two carts can oscillate 
in purely sinusoidal motion. Notice that the first one, $\omega_1$, is precisely the frequency 
of a single mass $m$ on a single spring $k$. We shall see the reason for this apparent 
coincidence in a moment. 


² Since there are two solutions for $\omega^2$, you might think this would give four solutions for 
$\omega = \pm\sqrt{\omega^2}$. However, a glance at Equations (11.6) and (11.7) will convince you that $+\omega$ and $-\omega$ 
constitute the same frequency for the real motion. 

³ From now on, I shall refer to (11.11) as the eigenvalue equation, omitting the “generalized.” 

Equation (11.15) tells us the two possible frequencies of our system, but we have 
not yet described the corresponding motions. Recall that the actual motion is given by 
the column of real numbers $\mathbf{x}(t) = \operatorname{Re}\mathbf{z}(t)$ where the complex column $\mathbf{z}(t) = \mathbf{a}e^{i\omega t}$, 
and $\mathbf{a}$ is made up of two fixed numbers, 

$$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix},$$

which must satisfy the eigenvalue equation 

$$(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0. \tag{11.16}$$

Now that we know the possible normal frequencies, we must solve this equation for 
the vector $\mathbf{a}$ for each normal frequency in turn. The sinusoidal motion with any one of 
the normal frequencies is called a normal mode, and I shall start with the first normal 
mode. 
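
As a numerical preview of this step, the first row of (11.16), with the matrix (11.14), reads $(2k - m\omega^2)a_1 - ka_2 = 0$, so the ratio $a_2/a_1$ is fixed once $\omega^2$ is chosen. The sketch below evaluates that ratio at each normal frequency and confirms that the second row is then satisfied as well. The function names and the values of $k$ and $m$ are hypothetical.

```python
def amplitude_ratio(k, m, omega_sq):
    """a2/a1 from the first row of (K - w^2 M) a = 0: (2k - m w^2) a1 - k a2 = 0."""
    return (2 * k - m * omega_sq) / k


def second_row_residual(k, m, omega_sq, a1, a2):
    """Second row of (K - w^2 M) a, which should vanish: -k a1 + (2k - m w^2) a2."""
    return -k * a1 + (2 * k - m * omega_sq) * a2


k, m = 2.0, 0.5  # illustrative values, arbitrary units
for omega_sq in (k / m, 3 * k / m):  # the two roots of the determinant condition
    ratio = amplitude_ratio(k, m, omega_sq)
    print(omega_sq, ratio, second_row_residual(k, m, omega_sq, 1.0, ratio))
```

The ratio comes out as $+1$ at $\omega^2 = k/m$ and $-1$ at $\omega^2 = 3k/m$: the carts move together in one mode and oppositely in the other.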


The First Normal Mode 

If we choose $\omega$ equal to the first normal frequency, $\omega_1 = \sqrt{k/m}$, then the matrix 
$(\mathbf{K} - \omega^2\mathbf{M})$ of (11.14) becomes 

$$(\mathbf{K} - \omega_1^2\mathbf{M}) = \begin{bmatrix} k & -k \\ -k & k \end{bmatrix}. \tag{11.17}$$

(Notice that this matrix has determinant 0, as it should.) Therefore, for this case, the 
eigenvalue equation (11.16) reads 

$$\begin{bmatrix} k & -k \\ -k & k \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = 0,$$

which is equivalent to the two equations 

$$\begin{aligned} a_1 - a_2 &= 0 \\ -a_1 + a_2 &= 0. \end{aligned}$$

Notice that these two equations are actually the same equation, and either one implies 
that $a_1 = a_2 = Ae^{-i\delta}$, say. The complex column $\mathbf{z}(t)$ is therefore 

$$\mathbf{z}(t) = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}e^{i\omega_1t} = \begin{bmatrix} A \\ A \end{bmatrix}e^{i(\omega_1t - \delta)}$$

and the corresponding actual motion is given by the real column $\mathbf{x}(t) = \operatorname{Re}\mathbf{z}(t)$ or 

$$\mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} A \\ A \end{bmatrix}\cos(\omega_1t - \delta).$$

That is, 

$$\left.\begin{aligned} x_1(t) &= A\cos(\omega_1t - \delta) \\ x_2(t) &= A\cos(\omega_1t - \delta) \end{aligned}\right\} \quad\text{[first normal mode]}. \tag{11.18}$$

Figure 11.2 The first normal mode for two equal-mass carts with three 
identical springs. The two carts oscillate back and forth with equal 
amplitudes and exactly in phase, so that $x_1(t) = x_2(t)$, and the middle 
spring remains at its equilibrium length all the time. 

Figure 11.3 In the first mode, the two positions oscillate sinusoidally, 
with equal amplitudes and in phase. 


We see that in the first normal mode the two carts oscillate in phase and with the same 
amplitude A, as shown in Figure 11.2. 

A striking feature of Figure 11.2 is that, because $x_1(t) = x_2(t)$, the middle spring 
is neither stretched nor compressed during the oscillations. This means that, for the 
first normal mode, the middle spring is actually irrelevant, and each cart oscillates just 
as if it were attached to a single spring. This explains why the first normal frequency 
$\omega_1 = \sqrt{k/m}$ is the same as for a single cart on a single spring. 

Another way to illustrate the motion in the first normal mode is just to plot the two 
positions $x_1$ and $x_2$ as functions of $t$. This is shown in Figure 11.3. 


The Second Normal Mode 


The second normal frequency at which our system can oscillate sinusoidally is given 
by (11.15) as $\omega_2 = \sqrt{3k/m}$, which, when substituted into (11.14), gives 

$$(\mathbf{K} - \omega_2^2\mathbf{M}) = \begin{bmatrix} -k & -k \\ -k & -k \end{bmatrix}. \tag{11.19}$$

Thus, for this normal mode, the eigenvalue equation $(\mathbf{K} - \omega_2^2\mathbf{M})\mathbf{a} = 0$ implies that 

$$\begin{bmatrix} -k & -k \\ -k & -k \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = 0,$$

which implies that $a_1 + a_2 = 0$, or $a_1 = -a_2 = Ae^{-i\delta}$, say. The complex column $\mathbf{z}(t)$ 
is therefore 

$$\mathbf{z}(t) = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}e^{i\omega_2t} = \begin{bmatrix} A \\ -A \end{bmatrix}e^{i(\omega_2t - \delta)}$$

and the corresponding actual motion is given by the real column $\mathbf{x}(t) = \operatorname{Re}\mathbf{z}(t)$ or 

$$\mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} A \\ -A \end{bmatrix}\cos(\omega_2t - \delta).$$

That is, 

$$\left.\begin{aligned} x_1(t) &= A\cos(\omega_2t - \delta) \\ x_2(t) &= -A\cos(\omega_2t - \delta) \end{aligned}\right\} \quad\text{[second normal mode]}. \tag{11.20}$$

We see that in the second normal mode the two carts oscillate with the same amplitude 
$A$ but exactly out of phase, as shown in the picture of Figure 11.4 and the graphs of 
Figure 11.5. 

Figure 11.4 The second normal mode for two equal-mass carts with 
three identical springs. The two carts oscillate back and forth with equal 
amplitudes but exactly out of phase, so that $x_2(t) = -x_1(t)$ at all times. 

Figure 11.5 In the second mode, the two positions oscillate sinusoidally, 
with equal amplitudes but exactly out of phase. 

Notice that in the second normal mode, when cart 1 is displaced to the right, cart 
2 is displaced an equal distance to the left, and vice versa. This means that when the 
outer two springs are stretched (as in Figure 11.4), the middle spring is compressed 
by twice as much. Thus, for example, when the left spring is pulling cart 1 to the left, 
the middle spring is pushing cart 1, also to the left, with a force that is twice as large. 
This means that each cart moves as if it were attached to a single spring with force 
constant $3k$. In particular, the second normal frequency is $\omega_2 = \sqrt{3k/m}$. 

The General Solution 

We have now found two normal-mode solutions, which we can rewrite as 

$$\mathbf{x}(t) = A_1\begin{bmatrix} 1 \\ 1 \end{bmatrix}\cos(\omega_1t - \delta_1) \quad\text{and}\quad \mathbf{x}(t) = A_2\begin{bmatrix} 1 \\ -1 \end{bmatrix}\cos(\omega_2t - \delta_2)$$

where $\omega_1$ and $\omega_2$ are the normal frequencies (11.15). Both of these solutions satisfy the 
equation of motion $\mathbf{M}\ddot{\mathbf{x}} = -\mathbf{K}\mathbf{x}$ for any values of the four real constants $A_1$, $\delta_1$, $A_2$, 
and $\delta_2$. Because the equation of motion is linear and homogeneous, the sum of these 
two solutions is also a solution: 

$$\mathbf{x}(t) = A_1\begin{bmatrix} 1 \\ 1 \end{bmatrix}\cos(\omega_1t - \delta_1) + A_2\begin{bmatrix} 1 \\ -1 \end{bmatrix}\cos(\omega_2t - \delta_2). \tag{11.21}$$

Because the equation of motion is really two second-order differential equations for 
the two variables $x_1(t)$ and $x_2(t)$, its general solution has four constants of integration. 
Therefore the solution (11.21), with its four arbitrary constants, is in fact the general 
solution. Any solution can be written in the form (11.21), with the constants $A_1$, $A_2$, $\delta_1$, 
and $\delta_2$ determined by the initial conditions. 

The general solution (11.21) is hard to visualize and describe. The motion of each 
cart is a mixture of the two frequencies, $\omega_1$ and $\omega_2$. Since $\omega_2 = \sqrt{3}\,\omega_1$, the motion 
never repeats itself, except in the special case that one of the constants $A_1$ or $A_2$ is 
zero (which gives us back one of the normal modes). Figure 11.6 shows graphs of 
the two positions in a typical nonnormal mode (with $A_1 = 1$, $A_2 = 0.7$, $\delta_1 = 0$, and 
$\delta_2 = \pi/2$). About the only simple thing one can say about these graphs is that they 
certainly are not very simple! 
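
To illustrate how the four constants are fixed by the initial conditions, note that adding and subtracting the two components of (11.21) gives $\tfrac{1}{2}(x_1 + x_2) = A_1\cos(\omega_1t - \delta_1)$ and $\tfrac{1}{2}(x_1 - x_2) = A_2\cos(\omega_2t - \delta_2)$, so each pair $(A_i, \delta_i)$ follows from the corresponding combination of initial positions and velocities. The sketch below is one way to carry this out. The function names are hypothetical, and the default values $k = m = 1$ are an arbitrary choice of units.

```python
import math


def general_solution(t, A1, d1, A2, d2, k=1.0, m=1.0):
    """Evaluate (x1, x2) from the general solution (11.21) at time t."""
    w1, w2 = math.sqrt(k / m), math.sqrt(3 * k / m)
    c1 = A1 * math.cos(w1 * t - d1)
    c2 = A2 * math.cos(w2 * t - d2)
    return c1 + c2, c1 - c2


def constants_from_initial_conditions(x1, x2, v1, v2, k=1.0, m=1.0):
    """Recover (A1, delta1, A2, delta2) from positions and velocities at t = 0."""
    w1, w2 = math.sqrt(k / m), math.sqrt(3 * k / m)
    # Half-sum and half-difference each oscillate at a single frequency.
    s, s_dot = (x1 + x2) / 2, (v1 + v2) / 2
    d, d_dot = (x1 - x2) / 2, (v1 - v2) / 2
    # At t = 0: A cos(delta) = value and A w sin(delta) = time derivative.
    A1, delta1 = math.hypot(s, s_dot / w1), math.atan2(s_dot / w1, s)
    A2, delta2 = math.hypot(d, d_dot / w2), math.atan2(d_dot / w2, d)
    return A1, delta1, A2, delta2


# Initial data generated by the Figure 11.6 constants A1 = 1, A2 = 0.7,
# delta1 = 0, delta2 = pi/2 (with k = m = 1).
v = 0.7 * math.sqrt(3.0)
print(constants_from_initial_conditions(1.0, 1.0, v, -v))
```

Feeding the recovered constants back into `general_solution` reproduces the motion for all later times.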



Figure 11.6 In the general solution, both $x_1(t)$ and $x_2(t)$ oscillate with 
both of the normal frequencies, producing a quite complicated nonperiodic 
motion. 


Normal Coordinates 

We have seen that in any possible motion of our two-cart system, both of the co¬ 
ordinates Xj(t) and x 2 (t) vary with time. In the normal modes, their time depen¬ 
dence is simple (sinusoidal), but it is still true that both vary, reflecting that the two 
carts are coupled and that one cart cannot move without the other. It is possible 
to introduce alternative, so-called normal coordinates which, although less physi¬ 
cally transparent, have the convenient property that each can vary independently of 
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the other. This statement is true for any system of coupled oscillators, but is espe¬ 
cially easy to see in the present case of two equal masses joined by three identical 
springs. 

In place of the coordinates x 1 and x 2 , we can characterize the positions of the two 
carts by the two normal coordinates 

^ = \{x x + x 2 ) (11.22) 

and 

£ 2 = : i(*,-* 2 ). (11.23) 

The physical significance of the original variables x x and x 2 (as the positions of the 
two carts) is obviously more transparent, but 4 and £ 2 serve just as well to label the 
configuration of the system. Moreover, if you refer back to (11.18) for the first normal 
mode, you will see that in the first mode the new variables are given by 

^ T (t) Acos(aqt <5)1 [first normal mode], (11.24) 

£ 2(0 — ( ~) J 

whereas in the second mode, we see from (11.20) that 

0 .. 1 [second normal mode]. (11.25) 

4(0 = A cos (eo 2 t - 8) j 

In the first normal mode the new variable $\xi_1$ oscillates, but $\xi_2$ remains zero. In
the second mode it is the other way round. In this sense, the new coordinates are
independent: either can oscillate without the other. The general motion of our
system is a superposition of both modes, and in this case both $\xi_1$ and $\xi_2$ oscillate,
but $\xi_1$ oscillates at the frequency $\omega_1$ only, and $\xi_2$ at the frequency $\omega_2$ only. In some
more complicated problems, these new normal coordinates represent a considerable
simplification. (See Problems 11.9, 11.10, and 11.11 for some examples and Section
11.7 for further discussion.)
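As a quick numerical check of this decoupling, here is a minimal Python sketch. It assumes the equal-mass, equal-spring system of Section 11.2, with spring matrix K = [[2k, -k], [-k, 2k]]; the helper name `matmul` and the values of m and k are illustrative choices, not from the text. Rewriting the equation of motion in terms of $\xi_1$ and $\xi_2$ should turn the spring matrix into a diagonal one, so that each normal coordinate obeys its own oscillator equation.

```python
import math

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

m, k = 1.0, 4.0                      # illustrative values (assumed)
K = [[2 * k, -k], [-k, 2 * k]]       # spring matrix for three equal springs

# xi = S x defines the normal coordinates (11.22) and (11.23).
S = [[0.5, 0.5], [0.5, -0.5]]
S_inv = [[1.0, 1.0], [1.0, -1.0]]    # x1 = xi1 + xi2, x2 = xi1 - xi2

# With equal masses, m x'' = -K x becomes m xi'' = -(S K S^-1) xi.
K_normal = matmul(matmul(S, K), S_inv)

omega1 = math.sqrt(K_normal[0][0] / m)   # frequency of xi1
omega2 = math.sqrt(K_normal[1][1] / m)   # frequency of xi2
print(K_normal, omega1, omega2)
```

The off-diagonal elements of `K_normal` vanish, which is the statement that $\xi_1$ and $\xi_2$ evolve independently of each other.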


11.3 Two Weakly Coupled Oscillators 


In the last section we discussed the oscillations of two equal masses joined by three 
equal springs. For this system, the two normal modes were easy to understand and to 
visualize, but the nonnormal oscillations were much less so. A system where some of 
the nonnormal oscillations are readily visualized is a pair of oscillators which have 
the same natural frequency and which are weakly coupled. As an example of such 
a system, consider the two identical carts shown in Figure 11.7, which are attached 
to their adjacent walls by identical springs (force constants k) and to each other by a 
much weaker spring (force constant $k_2 \ll k$).

Figure 11.7 Two weakly coupled carts. The middle spring which couples the
two carts is much weaker than the outer two springs.

We can quickly solve for the normal modes of this system. The mass matrix $\mathbf{M}$
is the same as before. The spring matrix $\mathbf{K}$ and the crucial combination $(\mathbf{K} - \omega^2\mathbf{M})$
that determines the eigenvalue problem are easily written down [starting with (11.5)
for $\mathbf{K}$]:

$$\mathbf{K} = \begin{bmatrix} k + k_2 & -k_2 \\ -k_2 & k + k_2 \end{bmatrix} \quad\text{and}\quad (\mathbf{K} - \omega^2\mathbf{M}) = \begin{bmatrix} k + k_2 - m\omega^2 & -k_2 \\ -k_2 & k + k_2 - m\omega^2 \end{bmatrix}. \tag{11.26}$$


The determinant of $(\mathbf{K} - \omega^2\mathbf{M})$ is $(k - m\omega^2)(k + 2k_2 - m\omega^2)$, and we conclude that
the two normal frequencies are

$$\omega_1 = \sqrt{\frac{k}{m}} \quad\text{and}\quad \omega_2 = \sqrt{\frac{k + 2k_2}{m}}. \tag{11.27}$$


The first frequency is exactly the same as in the previous example, and we can 
easily see why. The motion in this first mode is, as you can check, the same motion as 
shown in Figure 11.2 for the first mode of the equal-spring case. The important point 
is that in this mode the two carts move together in such a way that the middle spring is 
undisturbed and hence irrelevant. Naturally we get the same frequency for this mode 
whatever the strength of the middle spring. 
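The normal frequencies (11.27), and the fact that the first one does not depend on the middle spring, can be checked numerically. The sketch below is a minimal Python illustration, assuming equal masses and illustrative values of m, k, and k_2 that are not from the text; the helper name `normal_frequencies` is hypothetical. Because the mass matrix is just m times the unit matrix here, the allowed values of $\omega^2$ are the eigenvalues of the symmetric 2x2 matrix K/m, which follow from its trace and determinant.

```python
import math

def normal_frequencies(m, k, k2):
    """Return (omega1, omega2) for two equal carts: outer springs k, middle spring k2."""
    # Elements of K/m, with K taken from (11.26).
    a = (k + k2) / m          # diagonal element
    b = -k2 / m               # off-diagonal element
    trace, det = 2 * a, a * a - b * b
    disc = math.sqrt(trace * trace - 4 * det)
    lam1, lam2 = (trace - disc) / 2, (trace + disc) / 2   # values of omega^2
    return math.sqrt(lam1), math.sqrt(lam2)

m, k = 1.0, 1.0                       # illustrative values (assumed)
for k2 in (0.01, 0.1, 1.0):
    w1, w2 = normal_frequencies(m, k, k2)
    print(f"k2 = {k2}: omega1 = {w1:.6f}, omega2 = {w2:.6f}")
```

For each choice of k_2 the first frequency stays at $\sqrt{k/m}$, while the second moves away from it as the coupling grows.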

In the second mode also, the motion is the same as for the corresponding mode
of the equal-spring example, namely, the two carts oscillating exactly out of phase,
both moving inward or both moving outward at any one time, as in Figure 11.4. But
in this mode, the strength of the middle spring is, of course, relevant, and the second
normal frequency $\omega_2$ as given by (11.27) depends on $k_2$. In the present case, $\omega_2$ is
very close to $\omega_1$, since $k_2 \ll k$. To take advantage of this closeness, it is convenient to
define $\omega_0$ to be the average of the two normal frequencies

$$\omega_0 = \frac{\omega_1 + \omega_2}{2}.$$

Since $\omega_1$ and $\omega_2$ are very close to each other, $\omega_0$ is very close to either, and for most
purposes we can think of $\omega_0$ as essentially the same as $\omega_1 = \sqrt{k/m}$. To show the small
difference between $\omega_1$ and $\omega_2$, I shall write

$$\omega_1 = \omega_0 - \epsilon \quad\text{and}\quad \omega_2 = \omega_0 + \epsilon.$$

That is, the small number $\epsilon$ is half the difference between the two normal frequencies.

The two normal modes of the weakly coupled carts can now be written as

$$\mathbf{z}(t) = C_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{i(\omega_0 - \epsilon)t} \quad\text{and}\quad \mathbf{z}(t) = C_2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{i(\omega_0 + \epsilon)t}.$$

Both of these satisfy the equation of motion for any values of the two complex numbers
$C_1$ and $C_2$. (It is convenient to continue to work with the “fictitious” complex solutions
for a bit longer.) The sum of these two solutions is also a solution,

$$\mathbf{z}(t) = C_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{i(\omega_0 - \epsilon)t} + C_2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{i(\omega_0 + \epsilon)t}, \tag{11.28}$$

and, since it contains four arbitrary real constants (the two complex constants $C_1$ and
$C_2$ are equivalent to four real constants), it is the general solution. The constants $C_1$ and
$C_2$ in (11.28) are determined by the initial conditions, that is, by the positions and velocities
of the two carts at $t = 0$.

To see some general features of the solution (11.28), it is helpful to factor it as

$$\mathbf{z}(t) = \left\{ C_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{-i\epsilon t} + C_2 \begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{i\epsilon t} \right\} e^{i\omega_0 t}. \tag{11.29}$$

This expresses our solution as a product of two terms. The term in braces, $\{\cdots\}$, is a
$(2 \times 1)$ column matrix which depends on $t$. But because $\epsilon$ is very small, this column
varies very slowly compared to the second factor $e^{i\omega_0 t}$. Over any reasonably short
time interval, the first factor is essentially constant and our solution behaves like
$\mathbf{z}(t) = \mathbf{a}e^{i\omega_0 t}$, with $\mathbf{a}$ constant. That is, over any short time interval, the two carts
will oscillate sinusoidally with angular frequency $\omega_0$. But if we wait long enough, the
“constant” $\mathbf{a}$ will vary slowly, and the details of the two carts’ motion will change. I
shall illustrate this behavior in detail in a moment.

Let us now examine the behavior of (11.29) for some simple values of the constants
$C_1$ and $C_2$. First, if either $C_1$ or $C_2$ is zero, the solution (11.29) reverts to one of the
normal modes. (For instance, if $C_1 = 0$, the solution is the second normal mode.) A
more interesting case is that $C_1$ and $C_2$ are equal in magnitude, and, to simplify the
discussion, I shall suppose $C_1$ and $C_2$ are equal and real,

$$C_1 = C_2 = A/2,$$

say. (The 2 is just for future convenience.) In this case (11.29) becomes

$$\mathbf{z}(t) = \frac{A}{2} \begin{bmatrix} e^{-i\epsilon t} + e^{i\epsilon t} \\ e^{-i\epsilon t} - e^{i\epsilon t} \end{bmatrix} e^{i\omega_0 t} = A \begin{bmatrix} \cos\epsilon t \\ -i\sin\epsilon t \end{bmatrix} e^{i\omega_0 t}. \tag{11.30}$$


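To find the actual motion of the two carts, we must take the real part of this matrix,
$\mathbf{x}(t) = \operatorname{Re}\mathbf{z}(t)$, whose two elements are the two positions.

$$\left.\begin{aligned} x_1(t) &= A\cos\epsilon t\,\cos\omega_0 t \\ x_2(t) &= A\sin\epsilon t\,\sin\omega_0 t. \end{aligned}\right\} \tag{11.31}$$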

The solution (11.31) has a simple and elegant interpretation. Notice first that at time
zero, $x_1 = A$, whereas $x_2 = \dot{x}_1 = \dot{x}_2 = 0$. That is, our solution describes the motion
when cart 1 is pulled a distance $A$ to the right and released at $t = 0$, with cart 2
stationary at its equilibrium position. Because $\epsilon$ is very small, there is an appreciable
interval (namely, $0 \le t \ll 1/\epsilon$) during which the functions in (11.31) that involve $\epsilon t$
remain essentially unchanged, that is, $\cos\epsilon t \approx 1$ and $\sin\epsilon t \approx 0$. During this initial
interval, the two positions, as given by (11.31), are just

$$\left.\begin{aligned} x_1(t) &\approx A\cos\omega_0 t \\ x_2(t) &\approx 0. \end{aligned}\right\} \tag{11.32}$$

Initially, cart 1 oscillates with amplitude $A$ and frequency $\omega_0$, while cart 2 remains
stationary.

This simple state of affairs cannot last forever. As soon as cart 1 begins to move, it
starts to flex the weak middle spring, which starts to push and pull on cart 2. Although
the force exerted by the middle spring is weak, it eventually starts to make cart 2
oscillate. This can be seen in (11.31), where the factor $\sin\epsilon t$ eventually becomes
appreciable, and cart 2 starts to oscillate, also at the frequency $\omega_0$. Notice that as the
factor $\sin\epsilon t$ in $x_2(t)$ grows toward 1, the factor $\cos\epsilon t$ in $x_1(t)$ shrinks toward zero, as
it has to do to keep the total energy of the two oscillating carts constant. Eventually,
when $t = \pi/2\epsilon$, the factor $\sin\epsilon t$ reaches 1 (and $\cos\epsilon t$ reaches zero), and there is an
interval when⁴

$$\left.\begin{aligned} x_1(t) &\approx 0 \\ x_2(t) &\approx A\sin\omega_0 t \end{aligned}\right\} \quad [t \approx \pi/2\epsilon]. \tag{11.33}$$


Now that cart 2 is oscillating at maximum amplitude and cart 1 not at all, cart 
2 starts to drive cart 1. Cart 1 begins to oscillate with increasing amplitude, and the 
amplitude of cart 2’s oscillations begins to diminish again. This process, in which the 
two carts pass the energy back and forth from one to the other, continues indefinitely 
(or until dissipative forces — which we are ignoring — have removed all the energy). 
It is illustrated in Figure 11.8, which shows $x_1(t)$ and $x_2(t)$ as given by (11.31) as
functions of t for a couple of cycles of passing the energy from cart 1 to 2 and back 
to 1 again. 
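A short numerical sketch of this energy exchange follows. It assumes the illustrative values A = 1, $\omega_0$ = 10, and $\epsilon$ = 0.5, chosen only so that $\epsilon \ll \omega_0$; none of these numbers, nor the function names, is from the text. It evaluates (11.31) together with the slowly varying amplitudes $A|\cos\epsilon t|$ and $A|\sin\epsilon t|$ of the two carts.

```python
import math

A, omega0, eps = 1.0, 10.0, 0.5       # illustrative values (assumed)

def positions(t):
    """Positions of the two carts from (11.31)."""
    x1 = A * math.cos(eps * t) * math.cos(omega0 * t)
    x2 = A * math.sin(eps * t) * math.sin(omega0 * t)
    return x1, x2

def amplitudes(t):
    """Slowly varying amplitudes of the fast oscillations of each cart."""
    return A * abs(math.cos(eps * t)), A * abs(math.sin(eps * t))

t_swap = math.pi / (2 * eps)          # time at which cart 2 carries all the motion
for t in (0.0, t_swap / 2, t_swap, 2 * t_swap):
    a1, a2 = amplitudes(t)
    print(f"t = {t:6.3f}  amplitude 1 = {a1:.3f}  amplitude 2 = {a2:.3f}")
```

The amplitude of cart 1 falls from A to zero over the time $\pi/2\epsilon$ while that of cart 2 rises from zero to A, and the roles are reversed over the following interval of the same length.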

Figure 11.8 The positions $x_1(t)$ and $x_2(t)$ of two weakly coupled oscillating
carts if cart 1 is released from rest at $x_1 = A > 0$ and cart 2 at $x_2 = 0$.

4 Notice that we can interpret what has happened in terms of the discussion of Section 5.6 on
resonance. The weak spring has been driving cart 2 at the resonant frequency $\omega_0$, so cart 2 should
have responded by oscillating $\pi/2$ behind the driver. This is exactly what we see from (11.32) and
(11.33), since $\sin\omega_0 t$ is indeed $\pi/2$ behind $\cos\omega_0 t$.

If you have studied the phenomenon of beats, you have probably noticed the similarity
of either graph in Figure 11.8 to a graph of beats. Beats are the result of superposing
two waves (sound waves, for example) with nearly equal frequencies. Because
of the small difference in frequencies, the two waves move regularly in and out of
phase (at any one location). This means that the resulting interference of the waves is
alternately constructive and destructive, and a graph of the resultant signal looks just
like either one of the graphs in Figure 11.8. To understand what is beating in the case
of our two carts, we need to consider again the two normal coordinates of Equations
(11.22) and (11.23), $\xi_1 = \tfrac{1}{2}(x_1 + x_2)$ and $\xi_2 = \tfrac{1}{2}(x_1 - x_2)$. For the present solution
(11.31) these are [remember the trig identity $\cos\theta\cos\phi + \sin\theta\sin\phi = \cos(\theta - \phi)$]

$$\left.\begin{aligned} \xi_1(t) &= \tfrac{1}{2}A\cos(\omega_0 - \epsilon)t = \tfrac{1}{2}A\cos\omega_1 t \\ \xi_2(t) &= \tfrac{1}{2}A\cos(\omega_0 + \epsilon)t = \tfrac{1}{2}A\cos\omega_2 t. \end{aligned}\right\} \tag{11.34}$$

That is, the two normal coordinates oscillate with equal amplitudes, the first at the
frequency $\omega_1$ and the second at the nearby frequency $\omega_2$. Since $x_1(t) = \xi_1(t) + \xi_2(t)$,
we see that $x_1(t)$ is the superposition of $\xi_1(t)$ and $\xi_2(t)$, and the waxing and waning
of $x_1(t)$ is the result of beats between these two signals of nearly equal frequencies.
The same applies to $x_2(t)$, except that, because $x_2(t) = \xi_1(t) - \xi_2(t)$, the moments of
constructive interference for $x_1(t)$ are moments of destructive interference for $x_2(t)$
and vice versa, as is clearly seen in Figure 11.8.
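As a check on this interpretation, the beat pattern of (11.31) can be recovered by adding and subtracting the two normal coordinates of (11.34) directly. The short derivation below uses only the standard expansions of $\cos(\theta \mp \phi)$:

```latex
\begin{align*}
x_1(t) &= \xi_1(t) + \xi_2(t)
        = \tfrac{1}{2}A\left[\cos(\omega_0 - \epsilon)t + \cos(\omega_0 + \epsilon)t\right]
        = A\cos\epsilon t\,\cos\omega_0 t, \\
x_2(t) &= \xi_1(t) - \xi_2(t)
        = \tfrac{1}{2}A\left[\cos(\omega_0 - \epsilon)t - \cos(\omega_0 + \epsilon)t\right]
        = A\sin\epsilon t\,\sin\omega_0 t.
\end{align*}
```

The slowly varying factors $\cos\epsilon t$ and $\sin\epsilon t$ are the beat envelopes, and the fact that one is a cosine and the other a sine is what makes the two carts wax and wane in turn.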


11.4 Lagrangian Approach: The Double Pendulum 


The analysis of the two oscillating carts in the last three sections was based on 
Newton’s second law. We could equally have derived the equations of motion using the 
Lagrangian formalism, although there is no particular advantage to doing so. However, 
we shall find that as we study systems of increasing complexity, the advantages of 
the Lagrangian approach rapidly become overwhelming. I shall start this section by 
rederiving the equations for the familiar two carts from their Lagrangian. I shall then 
do the same for another simple system with two degrees of freedom, the double 
pendulum. These two examples will pave the way for the general discussion in the 
next section. 


Lagrangian Approach for Two Carts on Three Springs 

Let us consider once more the two carts of Figure 11.1. We could write down the
equations of motion (11.2) from Newton’s second law as soon as we had identified
the forces on each of the carts. To do the same thing with Lagrange’s equations, we
have first to write down the kinetic and potential energies, $T$ and $U$, and then the
Lagrangian $\mathcal{L} = T - U$. The kinetic energy is just

$$T = \tfrac{1}{2}m_1\dot{x}_1^2 + \tfrac{1}{2}m_2\dot{x}_2^2. \tag{11.35}$$

To write down the potential energy, we must identify the extensions of the three
springs as $x_1$, $x_2 - x_1$, and $-x_2$, from which it immediately follows that the potential
energy is

$$\begin{aligned} U &= \tfrac{1}{2}k_1 x_1^2 + \tfrac{1}{2}k_2 (x_1 - x_2)^2 + \tfrac{1}{2}k_3 x_2^2 \\ &= \tfrac{1}{2}(k_1 + k_2) x_1^2 - k_2 x_1 x_2 + \tfrac{1}{2}(k_2 + k_3) x_2^2. \end{aligned} \tag{11.36}$$

These results immediately give us the Lagrangian $\mathcal{L} = T - U$ and thence the two
Lagrange equations of motion:

$$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}_1} = \frac{\partial\mathcal{L}}{\partial x_1} \quad\text{or}\quad m_1\ddot{x}_1 = -(k_1 + k_2)x_1 + k_2 x_2$$

and

$$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}_2} = \frac{\partial\mathcal{L}}{\partial x_2} \quad\text{or}\quad m_2\ddot{x}_2 = k_2 x_1 - (k_2 + k_3)x_2.$$

These are precisely the two equations of motion (11.2), which we rewrote in the
compact matrix form as $\mathbf{M}\ddot{\mathbf{x}} = -\mathbf{K}\mathbf{x}$. This alternative derivation of the same equations
has no particular advantage for this simple system. Here is a second system, which
is still very simple, but for which the Lagrangian approach is already distinctly more
straightforward than the Newtonian.


The Double Pendulum 

Consider a double pendulum, comprising a mass $m_1$ suspended by a massless rod of
length $L_1$ from a fixed pivot, and a second mass $m_2$ suspended by a massless rod of
length $L_2$ from $m_1$, as shown in Figure 11.9. It is a straightforward matter to write down
the Lagrangian $\mathcal{L}$ as a function of the two generalized coordinates $\phi_1$ and $\phi_2$ shown.
When the angle $\phi_1$ increases from 0, the mass $m_1$ rises by an amount $L_1(1 - \cos\phi_1)$
and gains a potential energy

$$U_1 = m_1 g L_1 (1 - \cos\phi_1).$$

Similarly, as $\phi_2$ increases from 0, the second mass rises by $L_2(1 - \cos\phi_2)$ but, in
addition, its point of support ($m_1$) has risen by $L_1(1 - \cos\phi_1)$. Thus

$$U_2 = m_2 g\,[L_1(1 - \cos\phi_1) + L_2(1 - \cos\phi_2)].$$

The total potential energy is therefore

$$U(\phi_1, \phi_2) = (m_1 + m_2) g L_1 (1 - \cos\phi_1) + m_2 g L_2 (1 - \cos\phi_2). \tag{11.37}$$






Figure 11.9 A double pendulum. The velocity of $m_2$ is the vector sum
of the two velocities shown, separated by an angle $\phi_2 - \phi_1$.


The velocity of $m_1$ is just $L_1\dot\phi_1$ in the tangential direction, as shown in Figure 11.9, so
its kinetic energy is

$$T_1 = \tfrac{1}{2}m_1 L_1^2\dot\phi_1^2.$$

The velocity of $m_2$ is the vector sum of two velocities, as I have indicated in Figure
11.9: the velocity $L_2\dot\phi_2$ of $m_2$ relative to its support $m_1$, plus the velocity $L_1\dot\phi_1$ of its
support. The angle between these two velocities is $(\phi_2 - \phi_1)$, so the kinetic energy of
$m_2$ is

$$T_2 = \tfrac{1}{2}m_2\left[L_1^2\dot\phi_1^2 + 2L_1L_2\dot\phi_1\dot\phi_2\cos(\phi_1 - \phi_2) + L_2^2\dot\phi_2^2\right]$$

and the total kinetic energy is

$$T = \tfrac{1}{2}(m_1 + m_2)L_1^2\dot\phi_1^2 + m_2L_1L_2\dot\phi_1\dot\phi_2\cos(\phi_1 - \phi_2) + \tfrac{1}{2}m_2L_2^2\dot\phi_2^2. \tag{11.38}$$

From (11.38) and (11.37) we can write down the Lagrangian $\mathcal{L} = T - U$ and then
the two Lagrange equations for $\phi_1$ and $\phi_2$. However, the resulting equations are too
complicated to be especially illuminating and can certainly not be solved analytically.
This situation is reminiscent of the simple pendulum, whose equation of motion
($L\ddot\phi = -g\sin\phi$) also cannot be solved in terms of elementary functions, forcing us
to solve it numerically or to make a suitable approximation. In that case, and in the
present case, the simplest useful approximation is the small-angle approximation,
which reduces the simple pendulum’s equation to the solvable $L\ddot\phi = -g\phi$. We are
going to see that for almost all coupled oscillating systems, the exact equations are
not analytically solvable, but that if we confine attention to small oscillations
(certainly an important special case), the equations reduce to a standard form that
is solvable.

Returning to the equations of the double pendulum, let us assume that both angles
$\phi_1$ and $\phi_2$ and the corresponding velocities $\dot\phi_1$ and $\dot\phi_2$ remain small at all times. This
lets us simplify the expressions for $T$ and $U$ by Taylor expanding them and dropping
all terms that are of third power or higher in the four small quantities. In (11.38) for
the kinetic energy, the only term that needs attention is the middle one. Since the
factor $\cos(\phi_1 - \phi_2)$ is already multiplied by the doubly small product $\dot\phi_1\dot\phi_2$, we can
approximate the cosine by 1, to give

$$T = \tfrac{1}{2}(m_1 + m_2)L_1^2\dot\phi_1^2 + m_2L_1L_2\dot\phi_1\dot\phi_2 + \tfrac{1}{2}m_2L_2^2\dot\phi_2^2. \tag{11.39}$$

In (11.37) for the potential energy, we must handle the cosines more carefully (since
they are not already multiplied by any small quantities). The Taylor series for $\cos\phi$
gives the approximation $\cos\phi \approx 1 - \phi^2/2$, which reduces (11.37) to

$$U = \tfrac{1}{2}(m_1 + m_2)gL_1\phi_1^2 + \tfrac{1}{2}m_2gL_2\phi_2^2. \tag{11.40}$$
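To get a feel for how good the small-angle form is, here is a minimal Python sketch comparing the exact potential energy (11.37) with its approximation (11.40). The parameter values and the function names `U_exact` and `U_small` are illustrative assumptions, not from the text.

```python
import math

m1, m2, L1, L2, g = 1.0, 1.0, 1.0, 1.0, 9.8   # illustrative values (assumed)

def U_exact(phi1, phi2):
    """Exact potential energy (11.37)."""
    return ((m1 + m2) * g * L1 * (1 - math.cos(phi1))
            + m2 * g * L2 * (1 - math.cos(phi2)))

def U_small(phi1, phi2):
    """Small-angle approximation (11.40)."""
    return 0.5 * (m1 + m2) * g * L1 * phi1**2 + 0.5 * m2 * g * L2 * phi2**2

for phi in (0.5, 0.1, 0.01):          # both angles set equal, in radians
    exact, approx = U_exact(phi, phi), U_small(phi, phi)
    print(f"phi = {phi}: relative error = {(approx - exact) / exact:.2e}")
```

The relative error falls off roughly as $\phi^2/12$, as expected from the next term in the Taylor series of $\cos\phi$, so the approximation is already good to better than a tenth of a percent at 0.1 rad.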

Before we use these simplified expressions for $T$ and $U$ to give us the equations
of motion, let us pause to examine what our assumption of small oscillations has
achieved. The exact expression (11.38) for $T$ was a transcendental function of the
coordinates $\phi_1, \phi_2$ and velocities $\dot\phi_1, \dot\phi_2$; the small-angle approximation reduced
this to a homogeneous quadratic function⁵ of the two velocities only. The exact
expression (11.37) for $U$ was a transcendental function of $\phi_1$ and $\phi_2$; the small-angle
approximation reduced this to a homogeneous quadratic function of $\phi_1$ and
$\phi_2$. We shall see that the same simplifications occur for a wide class of oscillating
systems: The assumption that all oscillations are small reduces $T$ to a homogeneous
quadratic function of the velocities and $U$ to a homogeneous quadratic function of the
coordinates.⁶ The simplifying feature of these homogeneous quadratic forms for $T$
and $U$ is that when we differentiate them to get Lagrange’s equations they reduce to
homogeneous linear functions, yielding equations of motion that can always be easily
solved.

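We can now substitute the approximate expressions (11.39) and (11.40) for $T$ and
$U$ into the Lagrangian $\mathcal{L} = T - U$ and write down the two Lagrange equations of
motion for $\phi_1$ and $\phi_2$:

$$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot\phi_1} = \frac{\partial\mathcal{L}}{\partial\phi_1} \quad\text{or}\quad (m_1 + m_2)L_1^2\ddot\phi_1 + m_2L_1L_2\ddot\phi_2 = -(m_1 + m_2)gL_1\phi_1 \tag{11.41}$$

and

$$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot\phi_2} = \frac{\partial\mathcal{L}}{\partial\phi_2} \quad\text{or}\quad m_2L_1L_2\ddot\phi_1 + m_2L_2^2\ddot\phi_2 = -m_2gL_2\phi_2. \tag{11.42}$$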


These two equations for $\phi_1$ and $\phi_2$ can be rewritten as a single matrix equation

$$\mathbf{M}\ddot{\boldsymbol\phi} = -\mathbf{K}\boldsymbol\phi \tag{11.43}$$

if we introduce the $(2 \times 1)$ column of coordinates

$$\boldsymbol\phi = \begin{bmatrix} \phi_1 \\ \phi_2 \end{bmatrix}$$

and the two $(2 \times 2)$ matrices

$$\mathbf{M} = \begin{bmatrix} (m_1 + m_2)L_1^2 & m_2L_1L_2 \\ m_2L_1L_2 & m_2L_2^2 \end{bmatrix} \quad\text{and}\quad \mathbf{K} = \begin{bmatrix} (m_1 + m_2)gL_1 & 0 \\ 0 & m_2gL_2 \end{bmatrix}. \tag{11.44}$$

5 A homogeneous quadratic function contains only second powers of its arguments: no first
powers or constant terms, and no powers higher than two.

6 It is an almost unique feature of systems of masses connected by Hooke’s-law springs that the
exact expressions for $T$ and $U$ [as in (11.35) and (11.36)] already are in these simple forms, without
our having to make any approximations.


The matrix equation (11.43) is exactly analogous to (11.3) for the two carts on springs. 
In the present case, the “mass” matrix M is not actually made up of masses, but it still 
plays the role of inertia in the equation of motion (11.43). (That is, it multiplies the 
second derivatives of the coordinates.) Similarly the “spring-constant” matrix K is not 
actually made up of spring constants, but it plays the analogous role in the equation 
of motion. 

The procedure for solving the equations of motion (11.43) is exactly the same as
for the two carts of Section 11.1. We first try to find solutions (normal modes) in
which the two coordinates $\phi_1$ and $\phi_2$ vary sinusoidally with the same angular frequency
$\omega$. Exactly as before, any such solution $\boldsymbol\phi(t)$ can be written as the real part of a complex
solution $\mathbf{z}(t)$ whose time dependence is just $e^{i\omega t}$; that is,

$$\boldsymbol\phi(t) = \operatorname{Re}\mathbf{z}(t) \quad\text{where}\quad \mathbf{z}(t) = \mathbf{a}e^{i\omega t} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} e^{i\omega t},$$

and the two components $a_1, a_2$ of $\mathbf{a}$ are constants. Exactly as before, a function of
this form satisfies the equation of motion (11.43) if and only if the frequency $\omega$ and
the column $\mathbf{a}$ satisfy the eigenvalue equation $(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0$. This equation for $\mathbf{a}$
has a nonzero solution if and only if $\det(\mathbf{K} - \omega^2\mathbf{M}) = 0$, a quadratic equation for $\omega^2$, which
determines the two normal frequencies of the double pendulum. Knowing these two
frequencies, we can go back and find the corresponding columns $\mathbf{a}$, and we then know
the two normal modes. Finally, the general motion of the system is just an arbitrary
superposition of these two normal modes.


Equal Lengths and Masses 

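To simplify the discussion, let us now restrict our attention to the case that our double
pendulum has equal masses, $m_1 = m_2 = m$, and equal lengths, $L_1 = L_2 = L$, say. The
equations tidy up appreciably if we recognize that $\sqrt{g/L}$ is the frequency of a single
pendulum of the same length $L$. If we call this frequency $\omega_0$, then we can replace $g$
everywhere by $L\omega_0^2$ and the two matrices $\mathbf{M}$ and $\mathbf{K}$ of (11.44) become (as you can
check)

$$\mathbf{M} = mL^2 \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \quad\text{and}\quad \mathbf{K} = mL^2 \begin{bmatrix} 2\omega_0^2 & 0 \\ 0 & \omega_0^2 \end{bmatrix}. \tag{11.45}$$

The matrix $(\mathbf{K} - \omega^2\mathbf{M})$ of the eigenvalue equation is therefore

$$(\mathbf{K} - \omega^2\mathbf{M}) = mL^2 \begin{bmatrix} 2(\omega_0^2 - \omega^2) & -\omega^2 \\ -\omega^2 & (\omega_0^2 - \omega^2) \end{bmatrix}. \tag{11.46}$$

The normal frequencies are determined by the condition $\det(\mathbf{K} - \omega^2\mathbf{M}) = 0$, which
gives

$$2(\omega_0^2 - \omega^2)^2 - \omega^4 = \omega^4 - 4\omega_0^2\omega^2 + 2\omega_0^4 = 0$$

with the two solutions $\omega^2 = (2 \pm \sqrt{2})\omega_0^2$. That is, the two normal frequencies are
given by

$$\omega_1^2 = (2 - \sqrt{2})\omega_0^2 \quad\text{and}\quad \omega_2^2 = (2 + \sqrt{2})\omega_0^2 \tag{11.47}$$

(or $\omega_1 \approx 0.77\omega_0$ and $\omega_2 \approx 1.85\omega_0$) where $\omega_0 = \sqrt{g/L}$ is the frequency of a single
pendulum of length $L$.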

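Knowing the two normal frequencies we can now find the motion of the double pendulum
in the corresponding normal modes, by solving the equation $(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0$
with $\omega = \omega_1$ and $\omega_2$ in turn. If we substitute $\omega = \omega_1$, as given by (11.47), into (11.46),
we get

$$(\mathbf{K} - \omega_1^2\mathbf{M}) = mL^2\omega_0^2(\sqrt{2} - 1) \begin{bmatrix} 2 & -\sqrt{2} \\ -\sqrt{2} & 1 \end{bmatrix}.$$

Therefore, the equation $(\mathbf{K} - \omega_1^2\mathbf{M})\mathbf{a} = 0$ implies that $a_2 = \sqrt{2}\,a_1$, and if we write
$a_1 = A_1e^{-i\delta_1}$, the two coordinates are

$$\boldsymbol\phi(t) = \begin{bmatrix} \phi_1(t) \\ \phi_2(t) \end{bmatrix} = \operatorname{Re}\mathbf{a}e^{i\omega_1 t} = A_1 \begin{bmatrix} 1 \\ \sqrt{2} \end{bmatrix} \cos(\omega_1 t - \delta_1) \quad\text{[first mode]}. \tag{11.48}$$

We see that in the first normal mode the two pendulums oscillate exactly in phase,
with the amplitude of the lower pendulum $\sqrt{2}$ times that of the upper pendulum, as
shown in Figure 11.10.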



Figure 11.10 The first normal mode for a double pendulum with equal
masses and equal lengths. The two angles $\phi_1$ and $\phi_2$ oscillate in phase,
with the amplitude for $\phi_2$ larger by a factor of $\sqrt{2}$.

Figure 11.11 The second normal mode for a double pendulum with
equal masses and equal lengths. The two angles $\phi_1$ and $\phi_2$ oscillate
exactly out of phase, with the amplitude for $\phi_2$ larger by a factor of $\sqrt{2}$.


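Turning to the second mode, we find from (11.46) and (11.47) that

$$(\mathbf{K} - \omega_2^2\mathbf{M}) = -mL^2\omega_0^2(\sqrt{2} + 1) \begin{bmatrix} 2 & \sqrt{2} \\ \sqrt{2} & 1 \end{bmatrix}.$$

The equation $(\mathbf{K} - \omega_2^2\mathbf{M})\mathbf{a} = 0$ implies that $a_2 = -\sqrt{2}\,a_1$, and if we write
$a_1 = A_2e^{-i\delta_2}$, the two coordinates are

$$\boldsymbol\phi(t) = \begin{bmatrix} \phi_1(t) \\ \phi_2(t) \end{bmatrix} = \operatorname{Re}\mathbf{a}e^{i\omega_2 t} = A_2 \begin{bmatrix} 1 \\ -\sqrt{2} \end{bmatrix} \cos(\omega_2 t - \delta_2) \quad\text{[second mode]}. \tag{11.49}$$

In this second mode, $\phi_2(t)$ oscillates exactly out of phase with $\phi_1(t)$, again with an
amplitude that is $\sqrt{2}$ times bigger than that of $\phi_1(t)$, as shown in Figure 11.11. The
general solution is an arbitrary linear combination of the two normal modes (11.48)
and (11.49).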
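The normal frequencies and mode shapes found above can be reproduced numerically. The following is a minimal Python sketch, assuming the small-angle matrices (11.44); the function name `double_pendulum_modes` and the default parameter values are hypothetical. It solves $\det(\mathbf{K} - \omega^2\mathbf{M}) = 0$ as a quadratic in $\omega^2$ and then reads the amplitude ratio $a_2/a_1$ off the first row of $(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0$.

```python
import math

def double_pendulum_modes(m1=1.0, m2=1.0, L1=1.0, L2=1.0, g=9.8):
    """Return [(omega, a2/a1), ...] for the two small-oscillation normal modes."""
    # Elements of the symmetric matrices M and K in (11.44).
    M11, M12, M22 = (m1 + m2) * L1**2, m2 * L1 * L2, m2 * L2**2
    K11, K22 = (m1 + m2) * g * L1, m2 * g * L2      # K is diagonal

    # det(K - lam*M) = a*lam^2 + b*lam + c, with lam = omega^2.
    a = M11 * M22 - M12**2
    b = -(K11 * M22 + K22 * M11)
    c = K11 * K22
    disc = math.sqrt(b * b - 4 * a * c)

    modes = []
    for lam in ((-b - disc) / (2 * a), (-b + disc) / (2 * a)):
        # First row of (K - lam*M) a = 0:  (K11 - lam*M11) a1 - lam*M12 a2 = 0.
        ratio = (K11 - lam * M11) / (lam * M12)
        modes.append((math.sqrt(lam), ratio))
    return modes

omega0 = math.sqrt(9.8 / 1.0)         # single-pendulum frequency sqrt(g/L)
for omega, ratio in double_pendulum_modes():
    print(f"omega/omega0 = {omega / omega0:.3f},  a2/a1 = {ratio:+.3f}")
```

With equal masses and lengths this should reproduce, to rounding, the frequencies of (11.47) and the amplitude ratios $\pm\sqrt{2}$ of (11.48) and (11.49). Other masses and lengths can be explored by passing different arguments.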


11.5 The General Case 


We have now studied in great detail the normal modes of two systems (a pair of
carts attached to three springs and a double pendulum) and are ready to discuss the
general case of a system with $n$ degrees of freedom that is oscillating about a point of
stable equilibrium. Since the system has $n$ degrees of freedom, its configuration can
be specified by $n$ generalized coordinates,⁷ $q_1, \cdots, q_n$. To avoid too much notational
clutter, I shall now abbreviate the set of all $n$ coordinates by a single boldface $\mathbf{q}$,

$$\mathbf{q} = (q_1, \cdots, q_n).$$


7 I take for granted that the system is holonomic, so that the number of degrees of freedom equals
the number of generalized coordinates, as discussed in Section 7.3.




[Thus for the two carts of Section 11.1, $\mathbf{q}$ would denote the two displacements
$\mathbf{q} = (x_1, x_2)$, and for the double pendulum, $\mathbf{q} = (\phi_1, \phi_2)$.] (Note well that $\mathbf{q}$ is not,
in general, a three-dimensional vector; it is a vector in the $n$-dimensional space of the
generalized coordinates $q_1, \cdots, q_n$.)

I shall assume that the system is conservative, so that it has a potential energy

$$U(q_1, \cdots, q_n) = U(\mathbf{q})$$

and Lagrangian $\mathcal{L} = T - U$. The kinetic energy is, of course, $T = \tfrac{1}{2}\sum_\alpha m_\alpha\dot{\mathbf{r}}_\alpha^2$, where
the sum runs over all the particles, $\alpha = 1, \cdots, N$, that comprise the system. This has to
be rewritten in terms of the generalized coordinates $\mathbf{q} = (q_1, \cdots, q_n)$ using the relation
between the Cartesian coordinates $\mathbf{r}_\alpha$ and the generalized coordinates

$$\mathbf{r}_\alpha = \mathbf{r}_\alpha(q_1, \cdots, q_n) \tag{11.50}$$

where I shall take for granted that this relation does not involve the time $t$ explicitly.
[Recall that, in the terminology of Section 7.3, generalized coordinates for which
(11.50) does not involve the time are called “natural.”] We saw in detail in Section 7.8
that if we differentiate (11.50) with respect to $t$ and substitute into the kinetic energy,
we find that [compare Equation (7.94)]

$$T = T(\mathbf{q}, \dot{\mathbf{q}}) = \tfrac{1}{2}\sum_{j,k} A_{jk}(\mathbf{q})\,\dot{q}_j\dot{q}_k \tag{11.51}$$

where the coefficients $A_{jk}(\mathbf{q})$ may depend on the coordinates $\mathbf{q}$. [Compare (11.38)
for the case of the double pendulum.] Under our present assumptions, the Lagrangian
has the general form $\mathcal{L}(\mathbf{q}, \dot{\mathbf{q}}) = T(\mathbf{q}, \dot{\mathbf{q}}) - U(\mathbf{q})$, where $T(\mathbf{q}, \dot{\mathbf{q}})$ is given by (11.51),
and $U(\mathbf{q})$ is an as-yet unspecified function of the coordinates $\mathbf{q}$.

Our final assumption on the system is that it is making small oscillations about
a configuration of stable equilibrium. By redefining the coordinates if necessary, we
can arrange that the equilibrium position is $\mathbf{q} = 0$ (that is, $q_1 = \cdots = q_n = 0$). Then,
since we are interested only in small oscillations, we have only to concern ourselves
with small values of the coordinates $\mathbf{q}$, and we can use Taylor expansions of $U$ and $T$
about the equilibrium point $\mathbf{q} = 0$. For $U$ this gives

$$U(\mathbf{q}) = U(0) + \sum_j \frac{\partial U}{\partial q_j}\,q_j + \frac{1}{2}\sum_{j,k} \frac{\partial^2 U}{\partial q_j\,\partial q_k}\,q_j q_k + \cdots \tag{11.52}$$

where all derivatives are evaluated at $\mathbf{q} = 0$. This can be much simplified. First, since
$U(0)$ is a constant we can simply drop it, by redefining the zero of potential energy.
Second, since $\mathbf{q} = 0$ is an equilibrium point, all of the first derivatives $\partial U/\partial q_j$ are zero.
I shall rename the second derivatives as $\partial^2 U/\partial q_j\,\partial q_k = K_{jk}$ (which satisfy $K_{jk} = K_{kj}$
since it makes no difference in which order we evaluate second derivatives). Finally,
since the oscillations are small, I shall neglect all terms higher than second order in
the small quantities $\mathbf{q}$ or $\dot{\mathbf{q}}$. This reduces $U$ to

$$U = U(\mathbf{q}) = \tfrac{1}{2}\sum_{j,k} K_{jk}\,q_j q_k. \tag{11.53}$$

The kinetic energy is even simpler. Every term in (11.51) contains a factor $\dot{q}_j\dot{q}_k$ which
is already second order in small quantities. Therefore, we can ignore everything but the
constant term in the expansion of $A_{jk}(\mathbf{q})$. If we call this constant term $A_{jk}(0) = M_{jk}$,
this reduces the kinetic energy to

$$T = T(\dot{\mathbf{q}}) = \tfrac{1}{2}\sum_{j,k} M_{jk}\,\dot{q}_j\dot{q}_k \tag{11.54}$$

and the Lagrangian to

$$\mathcal{L}(\mathbf{q}, \dot{\mathbf{q}}) = T(\dot{\mathbf{q}}) - U(\mathbf{q}) \tag{11.55}$$

with $T(\dot{\mathbf{q}})$ given by (11.54) and $U(\mathbf{q})$ by (11.53). Notice that the approximate forms
(11.54) and (11.53) correspond to the approximations (11.39) and (11.40) for the double
pendulum. Just like the latter approximations, they reduce the kinetic energy to
a homogeneous quadratic function of the velocities $\dot{\mathbf{q}}$ and the potential energy to a
homogeneous quadratic function of the coordinates $\mathbf{q}$. Just as with the double pendulum,
this will guarantee that the equations of motion are solvable linear equations, but
before we take up the equations, let’s see one more simple example of this dramatic
simplification of $T$ and $U$ that results from the assumption of small oscillations.

EXAMPLE 11.1 A Bead on a Wire

A bead of mass $m$ is threaded on a frictionless wire that lies in the $xy$ plane
($y$ vertically up), bent in the shape $y = f(x)$ with a minimum at the origin, as
shown in Figure 11.12. Write down the potential and kinetic energies and their
simplified forms appropriate for small oscillations about $O$.

Figure 11.12 A bead threaded on a frictionless
wire in the shape $y = f(x)$.

This system has just one degree of freedom, and the natural choice of
generalized coordinate is just $x$. With this choice, the potential energy is simply
$U = mgy = mgf(x)$. When we confine ourselves to small oscillations, we can
Taylor expand $f(x)$. Since $f(0) = f'(0) = 0$, this gives

$$U(x) = mgf(x) \approx \tfrac{1}{2}mgf''(0)\,x^2.$$

The kinetic energy is $T = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2)$, where, by the chain rule, $\dot{y} = f'(x)\dot{x}$.
Therefore, $T = \tfrac{1}{2}m[1 + f'(x)^2]\dot{x}^2$. Notice that the exact expression for $T$ depends
on $x$ as well as $\dot{x}$. However, since $T$ already contains the factor $\dot{x}^2$, when
we make the small-oscillation approximation we can simply replace the term
$f'(x)$ by its value at $x = 0$ (namely zero) and we get

$$T(x, \dot{x}) = \tfrac{1}{2}m[1 + f'(x)^2]\dot{x}^2 \approx \tfrac{1}{2}m\dot{x}^2.$$

As expected, the small-oscillation approximation has reduced both $U$ and $T$ to
homogeneous quadratic functions of $x$ (for $U$) or $\dot{x}$ (for $T$).
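Although the example stops at $T$ and $U$, it is a short step to the resulting motion. With the approximate Lagrangian $\mathcal{L} = T - U$, Lagrange's equation gives simple harmonic motion, assuming $f''(0) > 0$ so that the origin is a genuine stable minimum:

```latex
\begin{align*}
\mathcal{L} &\approx \tfrac{1}{2}m\dot{x}^2 - \tfrac{1}{2}mg f''(0)\,x^2
\quad\Longrightarrow\quad
m\ddot{x} = -mg f''(0)\,x , \\
\omega &= \sqrt{g f''(0)} .
\end{align*}
```

For instance, for a parabolic wire $y = x^2/(2R)$ (a hypothetical choice of shape, not from the text), $f''(0) = 1/R$ and $\omega = \sqrt{g/R}$, the same as for a simple pendulum of length $R$.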


The Equation of Motion 


Returning to the approximate Lagrangian (11.55) for the general system, we can easily
write down the equations of motion. Since there are $n$ generalized coordinates $q_i$
($i = 1, \cdots, n$), there are $n$ corresponding equations

$$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_i} = \frac{\partial\mathcal{L}}{\partial q_i} \qquad [i = 1, \cdots, n]. \tag{11.56}$$


To write these equations explicitly, we must differentiate the expressions (11.54)
and (11.53) for $T$ and $U$. If you have never tried differentiating sums like these, it
may help to write them out explicitly at first. For example, for a system with just two
degrees of freedom ($n = 2$), Equation (11.53) for $U$ reads

$$\begin{aligned} U = \tfrac{1}{2}\sum_{j,k=1}^{2} K_{jk}\,q_j q_k &= \tfrac{1}{2}\left(K_{11}q_1^2 + K_{12}q_1q_2 + K_{21}q_2q_1 + K_{22}q_2^2\right) \\ &= \tfrac{1}{2}\left(K_{11}q_1^2 + 2K_{12}q_1q_2 + K_{22}q_2^2\right) \end{aligned} \tag{11.57}$$

where the second line follows because $K_{12} = K_{21}$. In this form we can easily differentiate
with respect to either $q_1$ or $q_2$. For example,

$$\frac{\partial U}{\partial q_1} = K_{11}q_1 + K_{12}q_2$$

with a corresponding expression for $\partial U/\partial q_2$, and quite generally (however many
degrees of freedom there are)

$$\frac{\partial U}{\partial q_i} = \sum_j K_{ij}\,q_j \qquad [i = 1, \cdots, n]. \tag{11.58}$$

Since differentiation of the kinetic energy (11.54) works in exactly the same way, we
can write down the $n$ Lagrange equations (11.56):

$$\sum_j M_{ij}\,\ddot{q}_j = -\sum_j K_{ij}\,q_j \qquad [i = 1, \cdots, n]. \tag{11.59}$$

These $n$ equations can be immediately grouped into a single matrix equation

$$\mathbf{M}\ddot{\mathbf{q}} = -\mathbf{K}\mathbf{q} \tag{11.60}$$

where $\mathbf{q}$ is the $(n \times 1)$ column

$$\mathbf{q} = \begin{bmatrix} q_1 \\ \vdots \\ q_n \end{bmatrix}$$

and $\mathbf{M}$ and $\mathbf{K}$ are the $(n \times n)$ “mass” and “spring-constant” matrices comprised of
the numbers $M_{ij}$ and $K_{ij}$ respectively.⁸

The matrix equation (11.60) is, of course, the $n$-dimensional equivalent of the two-dimensional
equations (11.3) for the pair of carts and (11.43) for the double pendulum,
and it is solved in exactly the same way. We first seek normal modes with the now-familiar
form

$$\mathbf{q}(t) = \operatorname{Re}\mathbf{z}(t), \quad\text{where}\quad \mathbf{z}(t) = \mathbf{a}e^{i\omega t} \tag{11.61}$$

and $\mathbf{a}$ is a constant $(n \times 1)$ column. These lead us to the eigenvalue equation

$$(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0, \tag{11.62}$$

which has nonzero solutions if and only if $\omega$ satisfies the characteristic or secular equation

$$\det(\mathbf{K} - \omega^2\mathbf{M}) = 0. \tag{11.63}$$

This determinant is an $n$th degree polynomial in $\omega^2$, so equation (11.63) has $n$
solutions, which tell us the $n$ normal frequencies of the system.⁹ With $\omega$ set equal
to each of the normal frequencies in turn, Equation (11.62) determines the motion
of the system in the corresponding normal mode. Finally, the general motion of the
system is given by an arbitrary sum of the normal mode solutions (11.61).

The general procedure outlined in the last three paragraphs is just what we have 
already discussed in detail for the examples of the two carts and the double pendulum. 
I shall go through one more example, this one with three degrees of freedom, in the 
next section, and you should certainly work some of the examples in the problems at 
the end of the chapter. 
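The general recipe can be sketched in a few lines of code. The following minimal Python illustration assumes real symmetric matrices M and K supplied as nested lists; the function names, the scan range, and the test system (three equal masses m in a line, joined to each other and to two walls by four equal springs k) are illustrative assumptions, not taken from the text. It locates the roots of the secular equation (11.63) by scanning $\omega^2$ for sign changes of $\det(\mathbf{K} - \omega^2\mathbf{M})$ and refining each one by bisection. In practice one would normally hand the problem to a library routine for the generalized symmetric eigenvalue problem instead.

```python
import math

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]                     # work on a copy
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d                                # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d

def secular(lam, M, K):
    """det(K - lam*M), with lam = omega^2, as in (11.63)."""
    n = len(M)
    return det([[K[i][j] - lam * M[i][j] for j in range(n)] for i in range(n)])

def normal_frequencies(M, K, lam_max, steps=2000):
    """Normal frequencies with omega^2 in [0, lam_max); repeated roots are missed."""
    roots, h = [], lam_max / steps
    for s in range(steps):
        lo, hi = s * h, (s + 1) * h
        f_lo, f_hi = secular(lo, M, K), secular(hi, M, K)
        if f_lo == 0.0:
            roots.append(lo)                      # grid point is itself a root
        elif f_lo * f_hi < 0:
            for _ in range(60):                   # bisection on the sign change
                mid = 0.5 * (lo + hi)
                f_mid = secular(mid, M, K)
                if f_lo * f_mid <= 0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return [math.sqrt(lam) for lam in roots]

m, k = 1.0, 1.0                                   # illustrative values (assumed)
M = [[m, 0, 0], [0, m, 0], [0, 0, m]]
K = [[2 * k, -k, 0], [-k, 2 * k, -k], [0, -k, 2 * k]]
print(normal_frequencies(M, K, lam_max=5.0))
```

For this test system the secular equation factors as $(2k - m\omega^2)[(2k - m\omega^2)^2 - 2k^2] = 0$, so the three values of $\omega^2$ should come out as $(2 - \sqrt{2})k/m$, $2k/m$, and $(2 + \sqrt{2})k/m$. The sign-change scan is chosen for transparency rather than efficiency, and it cannot detect a repeated root, which is one of the subtleties mentioned in footnote 9.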


8 From this point in the calculation all we need is the two matrices M and K. Thus, in practice, 
there is actually no need to write down the Lagrangian, nor the Lagrange equations, since the matrices 
M and K can be read off directly from the approximate expressions (11.54) and (11.53) for T and 
U. 

9 Two subtleties: First, it may happen that some of the roots of (11.63) are equal; this simply
means that some of the normal modes have equal frequencies and presents no serious problem.
Second, and much deeper, we need the $n$ solutions of (11.63) for $\omega^2$ to be real and positive,
in order that the normal frequencies be real. That this is actually so follows from properties of the
matrices $\mathbf{K}$ and $\mathbf{M}$ as I shall discuss in the appendix.


11.6 Three Coupled Pendulums


Consider three identical pendulums coupled by two identical springs as shown in 
Figure 11.13. As generalized coordinates it is natural to use the three angles shown
as $\phi_1$, $\phi_2$, and $\phi_3$, with the equilibrium position at $\phi_1 = \phi_2 = \phi_3 = 0$. Our first task
is to write down the Lagrangian for the system, at least for small displacements. 
The systematic, and perhaps the safest, procedure would be to write down the exact 
expressions for T and U and then make the small-angle approximations. In practice, 
finding the exact expressions can be very tedious. (In the present case, the potential 
energy of the springs depends on their extensions, and the exact expressions for these, 
good for any angles, are very cumbersome. See Problem 11.22.) It often happens that 
with care one can write down the small-angle approximations for T and U directly 
and save a lot of trouble, and this is what I shall do here. 

The kinetic energy of the three pendulums is easily seen to be

$$T = \tfrac{1}{2}mL^2\left(\dot\phi_1^2 + \dot\phi_2^2 + \dot\phi_3^2\right) \tag{11.64}$$

which does not require any approximation. The gravitational potential energy of
each pendulum has the form $mgL(1 - \cos\phi) \approx \tfrac{1}{2}mgL\phi^2$, where the last expression
is the well-known small-angle approximation. Thus the total gravitational potential
energy is

$$U_{\text{grav}} = \tfrac{1}{2}mgL\left(\phi_1^2 + \phi_2^2 + \phi_3^2\right). \tag{11.65}$$

To find the potential energy of the two springs, we have to find how much each
is stretched. For arbitrary values of the angles $\phi$ this is a fairly messy affair, but for
small angles the only appreciable stretching comes from the horizontal displacements
of the pendulum bobs, each of which moves a distance of approximately $L\phi$ to the
right. Thus, for example, the left spring is stretched by about $L(\phi_2 - \phi_1)$, and the total
spring potential energy is

$$U_{\text{spr}} = \tfrac{1}{2}kL^2\left[(\phi_2 - \phi_1)^2 + (\phi_3 - \phi_2)^2\right]
= \tfrac{1}{2}kL^2\left(\phi_1^2 + 2\phi_2^2 + \phi_3^2 - 2\phi_1\phi_2 - 2\phi_2\phi_3\right). \tag{11.66}$$

Figure 11.13 Three identical pendulums of lengths L and masses m
are coupled by two identical springs with spring constants k. The generalized
coordinates are the three angles $\phi_1$, $\phi_2$, and $\phi_3$. The springs’
natural lengths are equal to the separation of the supports of the pendulums,
so the equilibrium position is $\phi_1 = \phi_2 = \phi_3 = 0$, with all three
pendulums hanging vertically.

Before we combine these expressions for T and U to get the Lagrangian and the 
equations of motion, there is a useful device I would like to introduce. The equations 
we are going to derive involve several fixed parameters, (m, L, g, k), some of which 
are not especially interesting. The repeated writing of these parameters is, at the 
very least, an annoying chore and can easily lead to careless mistakes. Thus it is 
helpful to find some way to get rid of uninteresting parameters before we do any more 
calculation. A radical way to do this that is very popular with theoretical physicists is 
to choose a system of units such that the uninteresting parameters have the value 1 — 
a process sometimes described as choosing natural units. In the present problem, for 
example, we can choose m to be the unit of mass and L to be the unit of length. With 
this choice m and L naturally have the value 1, and they disappear from all subsequent 
work. This trick materially simplifies the trivial details of our calculations, reduces 
the danger of errors, and helps us to see the truly interesting features. 

The only serious disadvantage of using natural units is this: Once our calculations 
are complete, we sometimes want to know how our answers depend on the values of 
the parameters that have been suppressed. (What is the frequency of a certain normal
mode, if L = 1.5 m?) To answer this kind of question, we have to put the banished
parameters back into our answers. Although this process (of restoring the banished
parameters) seems daunting to the beginner, it is usually fairly easy. For example,
with m = L = 1, we are going to find that one of the normal frequencies is given by
$\omega^2 = g$. This implies that (in our system of units) the quantity $g/\omega^2$ has the value 1.
But you will recognize that $g/\omega^2$ has the dimensions of a length, and to say that a
length has the value 1 in our units is the same as saying that it has the value L in any
system of units. Therefore, $g/\omega^2 = L$ in general, and our answer is that $\omega^2 = g/L$
whatever units we use. This puts the “L” back into our answer and lets us find $\omega$ for
any given value of L.
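To make the restoration of units concrete, here is a small sketch that evaluates the restored result $\omega^2 = g/L$ for the length L = 1.5 m mentioned above. The value g = 9.81 m/s² and the function name `omega_first_mode` are assumptions made only for this illustration.

```python
import math


def omega_first_mode(L, g=9.81):
    """Angular frequency sqrt(g/L) of the mode with w^2 = g in natural units.

    L is in meters and g (assumed 9.81 m/s^2 by default) in m/s^2,
    so the result is in rad/s.
    """
    return math.sqrt(g / L)


omega = omega_first_mode(1.5)      # the L = 1.5 m case asked about in the text
period = 2 * math.pi / omega       # corresponding period in seconds
print(omega, period)
```

Setting L = 1 in the same function reproduces the natural-units answer $\omega = \sqrt{g}$.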

Let us then choose units with m = L = 1, so that the kinetic and potential energies 
obtained from (11.64), (11.65), and (11.66) become 

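$$T = \tfrac{1}{2}\left(\dot\phi_1^2 + \dot\phi_2^2 + \dot\phi_3^2\right) \tag{11.67}$$

and

$$U = \tfrac{1}{2}g\left(\phi_1^2 + \phi_2^2 + \phi_3^2\right) + \tfrac{1}{2}k\left(\phi_1^2 + 2\phi_2^2 + \phi_3^2 - 2\phi_1\phi_2 - 2\phi_2\phi_3\right). \tag{11.68}$$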

We could now write down the Lagrangian and then the equations of motion, but there 
is actually no need to do this. We already know that the result will be the now-familiar 
matrix equation 


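$$\mathbf{M}\ddot{\boldsymbol{\phi}} = -\mathbf{K}\boldsymbol{\phi} \tag{11.69}$$

where, in this case, $\boldsymbol{\phi}$ is the (3 × 1) column comprising the three angles $\phi_1$, $\phi_2$, and
$\phi_3$. The elements of the (3 × 3) matrices M and K can be read directly from (11.67)
and (11.68) to give¹⁰

$$\mathbf{M} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\quad\text{and}\quad
\mathbf{K} = \begin{bmatrix} g + k & -k & 0 \\ -k & g + 2k & -k \\ 0 & -k & g + k \end{bmatrix}. \tag{11.70}$$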


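The normal modes of our system have the familiar form $\boldsymbol{\phi}(t) = \operatorname{Re}\,\mathbf{z}(t) = \operatorname{Re}\,\mathbf{a}e^{i\omega t}$,
where a and $\omega$ are determined by the eigenvalue equation

$$(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0. \tag{11.71}$$

Our first step is to find the possible normal frequencies from the characteristic equation
$\det(\mathbf{K} - \omega^2\mathbf{M}) = 0$, to which end we need to write down the matrix $(\mathbf{K} - \omega^2\mathbf{M})$,

$$(\mathbf{K} - \omega^2\mathbf{M}) = \begin{bmatrix} g + k - \omega^2 & -k & 0 \\ -k & g + 2k - \omega^2 & -k \\ 0 & -k & g + k - \omega^2 \end{bmatrix}. \tag{11.72}$$

The determinant is easily evaluated as

$$\det(\mathbf{K} - \omega^2\mathbf{M}) = (g - \omega^2)(g + k - \omega^2)(g + 3k - \omega^2),$$

so that the three normal frequencies are given by

$$\omega_1^2 = g, \quad \omega_2^2 = g + k, \quad\text{and}\quad \omega_3^2 = g + 3k. \tag{11.73}$$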
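The claim that the determinant is “easily evaluated” can be spot-checked numerically. The sketch below compares a direct cofactor expansion of the matrix (11.72) with the factored form, in natural units (m = L = 1). The function names and the sample values of g, k, and $\omega^2$ are arbitrary choices for illustration.

```python
def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def secular_direct(g, k, w2):
    """det(K - w2*M) computed directly from the matrix (11.72), with M the unit matrix."""
    return det3([[g + k - w2, -k, 0.0],
                 [-k, g + 2 * k - w2, -k],
                 [0.0, -k, g + k - w2]])


def secular_factored(g, k, w2):
    """The factored form (g - w2)(g + k - w2)(g + 3k - w2)."""
    return (g - w2) * (g + k - w2) * (g + 3 * k - w2)


# Arbitrary sample values; the two forms should agree for every w2.
g, k = 9.8, 2.5
samples = [0.0, 1.3, 7.0, 12.3, 20.0]
differences = [secular_direct(g, k, w2) - secular_factored(g, k, w2) for w2 in samples]
print(differences)
```

The differences are zero to within rounding, which is consistent with the three roots listed in (11.73).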

Knowing the three normal frequencies, we can now find the three corresponding 
normal modes in turn. The first normal frequency has $\omega_1 = \sqrt{g}$. (This is in our units,
where L = 1. As I have already mentioned, in arbitrary units it is $\omega_1 = \sqrt{g/L}$, which
is the frequency for a single pendulum of length L. We’ll see the reason for this
coincidence in a moment.) If we substitute $\omega_1$ into Equation (11.72) for $(\mathbf{K} - \omega^2\mathbf{M})$,
then the eigenvalue equation (11.71) implies (as you should check) that

$$a_1 = a_2 = a_3 = Ae^{-i\delta}, \qquad\text{[first mode]}$$

say. That is, in the first mode,

$$\phi_1(t) = \phi_2(t) = \phi_3(t) = A\cos(\omega_1 t - \delta)$$

and the three pendulums oscillate in unison (equal amplitudes and phases), as shown
in Figure 11.14(a). In this mode, the springs are neither compressed nor stretched,
and their presence is irrelevant. Therefore, each pendulum oscillates just like a single
pendulum, with frequency $\omega_1 = \sqrt{g/L}$ (or $\sqrt{g}$ in our units).


10 One has to think a little when reading off the elements of these matrices. The rule is this: If
you ignore the factor $\tfrac{1}{2}$ in front of (11.68), for example, then the diagonal element $K_{ii}$ is just the
coefficient of $\phi_i^2$, while the off-diagonal element $K_{ij}$ is half the coefficient of $\phi_i\phi_j$. To understand
this, look at (11.57).

[Figure 11.14 panels: (a) $\phi_1$, $\phi_2 = \phi_1$, $\phi_3 = \phi_1$; (b) $\phi_1$, $\phi_2 = 0$, $\phi_3 = -\phi_1$; (c) $\phi_1$, $\phi_2 = -2\phi_1$, $\phi_3 = \phi_1$]

Figure 11.14 The three normal modes for three coupled pendulums. (a) In
the first mode, the three pendulums oscillate in unison. Since neither spring is
stretched or compressed, the frequency is just that for a single pendulum of the
same length. (b) In the second mode, the outer two pendulums oscillate exactly
out of phase, while the middle one doesn’t move at all. (c) In the third mode,
the outer two pendulums oscillate in unison, but the middle one is exactly out of
phase and has twice the amplitude.


If we substitute $\omega = \omega_2$, the eigenvalue equation (11.71) implies that

$$a_1 = -a_3 = Ae^{-i\delta}, \quad\text{but}\quad a_2 = 0 \qquad\text{[second mode]}.$$

Therefore, the outer two pendulums oscillate exactly out of phase, while the middle
one sits at rest, as shown in Figure 11.14(b). Finally, substituting $\omega = \omega_3$ into the
eigenvalue equation (11.71) leads to the result

$$a_1 = -\tfrac{1}{2}a_2 = a_3 = Ae^{-i\delta}, \qquad\text{[third mode]}$$

say. Thus, in the third mode, the outer two pendulums oscillate in unison, but the 
middle one oscillates with twice the amplitude and exactly out of phase, as shown in 
Figure 11.14(c). The general solution is an arbitrary linear combination of all three 
normal modes. 
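The three mode shapes can be verified by direct substitution into the eigenvalue equation (11.71). The following is a minimal sketch in natural units (m = L = 1, so M is the unit matrix), with the overall factor $Ae^{-i\delta}$ set to 1. The numerical values of g and k and the helper names are assumptions for illustration only.

```python
def matvec(A, v):
    """Multiply a square matrix (list of rows) by a column (list)."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]


# Arbitrary sample values of g and k in natural units.
g, k = 9.8, 2.5
K = [[g + k, -k, 0.0],
     [-k, g + 2 * k, -k],
     [0.0, -k, g + k]]

# (w^2, a) pairs: unison, outer pair out of phase, middle out of phase with twice the amplitude.
modes = [
    (g,         [1.0,  1.0,  1.0]),
    (g + k,     [1.0,  0.0, -1.0]),
    (g + 3 * k, [1.0, -2.0,  1.0]),
]

# Since M is the unit matrix, (K - w^2 M) a = K a - w^2 a.
residuals = []
for w2, a in modes:
    Ka = matvec(K, a)
    residuals.append([Ka[i] - w2 * a[i] for i in range(3)])
print(residuals)
```

Each residual column vanishes to within rounding, so each pair satisfies (11.71).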


11.7 Normal Coordinates* 


* This section could be skipped if you are pressed for time. 

In Section 11.2, we found the normal modes for two equal masses joined by three 
identical springs. At the end of that section, I mentioned that one can replace the two 
coordinates $x_1$ and $x_2$ by two “normal coordinates”

$$\xi_1 = \tfrac{1}{2}(x_1 + x_2) \quad\text{and}\quad \xi_2 = \tfrac{1}{2}(x_1 - x_2). \tag{11.74}$$

These new coordinates have the property that each always oscillates at just one of the
two normal frequencies: $\xi_1$ at the frequency $\omega_1$ and $\xi_2$ at the frequency $\omega_2$. In this
section I will show that we can do the same thing for any system oscillating about a
stable equilibrium point. If the system has n degrees of freedom, then it is described
by n generalized coordinates $q_1, \ldots, q_n$ and has n normal modes with frequencies
$\omega_1, \ldots, \omega_n$. What I shall show is that we can introduce n new, normal, coordinates
$\xi_1, \ldots, \xi_n$ such that each normal coordinate $\xi_i$ oscillates at just one frequency, namely
the normal frequency $\omega_i$. Before I show this, I need to review and extend the discussion
of Section 11.2 for the two carts.


Normal Coordinates for Two Carts on Springs 

The equations of motion for the positions $x_1$ and $x_2$ of the two carts were given
in (11.2), which I rewrite here (for the case of equal masses and equal spring constants) as

$$\left.\begin{aligned} m\ddot{x}_1 &= -2kx_1 + kx_2 \\ m\ddot{x}_2 &= kx_1 - 2kx_2. \end{aligned}\right\} \tag{11.75}$$

A moment’s inspection should convince you that if we add these two equations, we
get an equation for $\xi_1 = \tfrac{1}{2}(x_1 + x_2)$ alone, and if we subtract them, we get an equation
for $\xi_2 = \tfrac{1}{2}(x_1 - x_2)$ alone:

$$\left.\begin{aligned} m\ddot{\xi}_1 &= -k\xi_1 \\ m\ddot{\xi}_2 &= -3k\xi_2. \end{aligned}\right\} \tag{11.76}$$

These two equations are uncoupled, and show that each normal coordinate oscillates,
as claimed, at a single frequency: $\xi_1$ at frequency $\omega_1 = \sqrt{k/m}$ and $\xi_2$ at
$\omega_2 = \sqrt{3k/m}$. In other words, the normal coordinates behave just like the coordinates
of two uncoupled oscillators: by going over to the normal coordinates, we
have “uncoupled” the oscillations.
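The uncoupling can be confirmed numerically for any instantaneous configuration of the carts. This is only a sketch: it evaluates the accelerations from (11.75), forms the half-sum and half-difference, and compares with the right sides of (11.76). The function names and the sample values of k, m, $x_1$, and $x_2$ are arbitrary assumptions.

```python
def accelerations(x1, x2, k, m):
    """Accelerations of the two carts from the coupled equations (11.75)."""
    return (-2 * k * x1 + k * x2) / m, (k * x1 - 2 * k * x2) / m


def normal_coordinates(x1, x2):
    """Half-sum and half-difference, as in the definition (11.74)."""
    return 0.5 * (x1 + x2), 0.5 * (x1 - x2)


# Arbitrary sample parameters and positions.
k, m = 3.0, 2.0
x1, x2 = 0.7, -0.2

a1, a2 = accelerations(x1, x2, k, m)
xi1, xi2 = normal_coordinates(x1, x2)

# The same linear combinations applied to the accelerations give the
# second derivatives of the normal coordinates.
xi1_ddot, xi2_ddot = normal_coordinates(a1, a2)

# (11.76) predicts xi1_ddot = -(k/m) xi1 and xi2_ddot = -(3k/m) xi2.
check1 = xi1_ddot + (k / m) * xi1
check2 = xi2_ddot + (3 * k / m) * xi2
print(check1, check2)
```

Both checks vanish whatever positions are chosen, which is just the statement that each normal coordinate obeys its own oscillator equation.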

Just as the equations (11.75) for $x_1$ and $x_2$ can be rewritten as a single matrix
equation $\mathbf{M}\ddot{\mathbf{x}} = -\mathbf{K}\mathbf{x}$, so the two equations (11.76) for $\xi_1$ and $\xi_2$ can be written as
$\mathbf{M}'\ddot{\boldsymbol{\xi}} = -\mathbf{K}'\boldsymbol{\xi}$, with the important difference that the two matrices M' and K' are both
diagonal:

$$\mathbf{M}' = \begin{bmatrix} m & 0 \\ 0 & m \end{bmatrix}
\quad\text{and}\quad
\mathbf{K}' = \begin{bmatrix} k & 0 \\ 0 & 3k \end{bmatrix}. \tag{11.77}$$

The transition from the original coordinates $(x_1, x_2)$ to the normal coordinates $(\xi_1, \xi_2)$
is said to diagonalize the matrices M and K. That the new matrices are diagonal
is precisely equivalent to the statement that the equations (11.76) for $\xi_1$ and $\xi_2$ are
uncoupled and that $\xi_1$ and $\xi_2$ oscillate independently.

We can define the two normal coordinates $\xi_1$ and $\xi_2$ differently, and more generally,
in terms of the eigenvectors a that describe the motion of the normal modes and are
determined by the eigenvalue equation $(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0$. We saw in Section 11.2 that
these two (2 × 1) columns are (for our two carts)

$$\mathbf{a}_{(1)} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\quad\text{and}\quad
\mathbf{a}_{(2)} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \tag{11.78}$$

[Two important points: First, each of these vectors contains an arbitrary multiplier A, but
I now want to fix this, and the simplest choice here is to make A = 1; another, and
sometimes better, choice is to normalize the vectors by putting in a factor of $1/\sqrt{2}$.
Second, each column a is made up of two numbers, which I have been calling $a_1$ and
$a_2$. But I am now discussing two different columns, $\mathbf{a}_{(1)}$ and $\mathbf{a}_{(2)}$, one for each normal
mode, and I am using the parentheses in the subscripts to emphasize this distinction.]
Now, it is easy to see that any (2 × 1) column vector can be written as a combination
of the two vectors $\mathbf{a}_{(1)}$ and $\mathbf{a}_{(2)}$. (See Problem 11.33.) In particular, I can expand the
column x in this way as

$$\mathbf{x} = \xi_1\mathbf{a}_{(1)} + \xi_2\mathbf{a}_{(2)} = \begin{bmatrix} \xi_1 + \xi_2 \\ \xi_1 - \xi_2 \end{bmatrix}. \tag{11.79}$$

The first equality defines $\xi_1$ and $\xi_2$ as the coefficients in this expansion of x in terms of
the eigenvectors $\mathbf{a}_{(1)}$ and $\mathbf{a}_{(2)}$, but inspection of the last expression in (11.79) should
convince you that it defines $\xi_1$ and $\xi_2$ to be precisely the normal coordinates of (11.74).
That is, the normal coordinates, with the property that each oscillates independently
at one of the normal frequencies, can be defined as the coefficients in the expansion of
x in terms of the eigenvectors $\mathbf{a}_{(1)}$ and $\mathbf{a}_{(2)}$. We shall see that this definition carries over
naturally to the general case of oscillations of any system with n degrees of freedom.


The General Case 

We can now easily introduce normal coordinates for an arbitrary oscillating system
with n generalized coordinates $q_1, \ldots, q_n$. We know that such a system has n normal
modes. In mode i, the column vector q oscillates sinusoidally,

$$\mathbf{q}(t) = \mathbf{a}_{(i)}\cos(\omega_i t - \delta_i)$$

where the fixed column $\mathbf{a}_{(i)}$ satisfies the eigenvalue equation

$$\mathbf{K}\mathbf{a}_{(i)} = \omega_i^2\mathbf{M}\mathbf{a}_{(i)}. \tag{11.80}$$

The columns $\mathbf{a}_{(1)}, \ldots, \mathbf{a}_{(n)}$ are n independent, real¹¹ (n × 1) columns, and any (n × 1)
column can be expanded in terms of them; that is, the vectors $\mathbf{a}_{(1)}, \ldots, \mathbf{a}_{(n)}$ are a
basis or complete set for the space of all (n × 1) vectors. (For a proof of these
properties, see the appendix.) Thus any solution of the equations of motion q(t) can
be expanded as

$$\mathbf{q}(t) = \sum_{i=1}^{n}\xi_i(t)\,\mathbf{a}_{(i)}. \tag{11.81}$$

This definition of the normal coordinates exactly parallels the new definition
(11.79) for the system of two carts on springs, and we can now prove that it
has the desired property that the different $\xi_i(t)$ oscillate independently, as follows:

The column q(t) satisfies the equation of motion

$$\mathbf{M}\ddot{\mathbf{q}} = -\mathbf{K}\mathbf{q}.$$

11 That the vectors $\mathbf{a}_{(i)}$ are real is not obvious. In the case of the two carts, you can see from (11.78)
that they are. In the general case, they are determined by the real eigenvalue equation (11.80) and 
it is fairly easy to prove that either they are real or they can be redefined so that they are. See the 
appendix. 





If we replace q by its expansion (11.81), this equation becomes

$$\sum_{i=1}^{n}\ddot{\xi}_i(t)\,\mathbf{M}\mathbf{a}_{(i)} = -\sum_{i=1}^{n}\xi_i(t)\,\mathbf{K}\mathbf{a}_{(i)} = -\sum_{i=1}^{n}\xi_i(t)\,\omega_i^2\,\mathbf{M}\mathbf{a}_{(i)} \tag{11.82}$$

where the last equality follows from the eigenvalue equation (11.80). Now, the n
column vectors $\mathbf{a}_{(1)}, \ldots, \mathbf{a}_{(n)}$ are independent, and this property is unchanged when
they are multiplied by the matrix M. Therefore, the n vectors $\mathbf{M}\mathbf{a}_{(1)}, \ldots, \mathbf{M}\mathbf{a}_{(n)}$
are also independent, and the equality (11.82) can only hold if all corresponding
coefficients on each side are equal.¹² That is,

$$\ddot{\xi}_i(t) = -\omega_i^2\,\xi_i(t).$$

This establishes that the normal coordinates $\xi_i$ defined by (11.81) do indeed oscillate
independently at the advertised frequencies.
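As an illustration of the expansion (11.81), the sketch below computes the normal coordinates of the three coupled pendulums of Section 11.6 for a given set of angles. It relies on a shortcut that holds for that example but is not the general method: because M is the unit matrix there and K is symmetric, the three eigenvectors are mutually orthogonal, so each coefficient is a simple projection. The function names and the sample angles are assumptions for illustration.

```python
# Eigenvectors of the three coupled pendulums (Section 11.6), with the
# arbitrary multiplier fixed by making the first component 1.
MODES = [
    [1.0,  1.0,  1.0],   # first mode: all in unison
    [1.0,  0.0, -1.0],   # second mode: outer pair out of phase
    [1.0, -2.0,  1.0],   # third mode: middle out of phase, twice the amplitude
]


def dot(u, v):
    """Ordinary scalar product of two columns."""
    return sum(ui * vi for ui, vi in zip(u, v))


def normal_coordinates(phi):
    """Coefficients xi_i in phi = sum_i xi_i a_(i), found by projection.

    Valid here only because these eigenvectors are mutually orthogonal.
    """
    return [dot(a, phi) / dot(a, a) for a in MODES]


def reconstruct(xi):
    """Rebuild the column of angles from its normal coordinates."""
    return [sum(xi[i] * MODES[i][j] for i in range(3)) for j in range(3)]


phi = [0.10, -0.05, 0.02]          # arbitrary small angles (radians)
xi = normal_coordinates(phi)
phi_back = reconstruct(xi)
print(xi, phi_back)
```

Rebuilding the angles from the coefficients returns the original column, as the completeness of the eigenvectors requires. For a system with a nontrivial M, the coefficients would instead be found by solving the linear system (11.81) directly.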


Principal Definitions and Equations of Chapter 11

The Equations of Motion in Matrix Form

The configuration of a system with n degrees of freedom can be specified by an n × 1
column matrix q, comprising the n generalized coordinates $q_1, \ldots, q_n$. The equation
of motion for small oscillations about a stable equilibrium (with the coordinates
chosen so that q = 0 at equilibrium) has the matrix form

$$\mathbf{M}\ddot{\mathbf{q}} = -\mathbf{K}\mathbf{q} \qquad\text{[Eq. (11.60)]}$$

where M and K are the n × n “mass” and “spring-constant” matrices. One way to
find these matrices is to write the KE and PE of the system in the forms

$$T = \tfrac{1}{2}\sum_{j,k} M_{jk}\,\dot{q}_j\dot{q}_k \quad\text{and}\quad U = \tfrac{1}{2}\sum_{j,k} K_{jk}\,q_j q_k \qquad\text{[Eqs. (11.54) \& (11.53)]}$$


Normal Modes 

A normal mode is any motion in which all n coordinates oscillate sinusoidally with
the same frequency $\omega$ (a normal frequency) and can be written as

$$\mathbf{q}(t) = \operatorname{Re}\left(\mathbf{a}e^{i\omega t}\right) \qquad\text{[Eq. (11.61)]}$$

where the constant n × 1 column a must satisfy the generalized eigenvalue equation

$$(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0. \qquad\text{[Eq. (11.62)]}$$

12 The result I am using here is the analog of the familiar result in three-dimensional space that
the equality $\sum_i \lambda_i\mathbf{e}_i = \sum_i \mu_i\mathbf{e}_i$ is only possible if $\lambda_i = \mu_i$ for all i. For a proof of the independence
of the vectors $\mathbf{a}_{(1)}, \ldots, \mathbf{a}_{(n)}$ see the appendix.

For any system with n degrees of freedom and a stable equilibrium at q = 0, there
are n normal frequencies $\omega_1, \ldots, \omega_n$ (some of which may be equal) and n independent
corresponding eigenvectors $\mathbf{a}_{(1)}, \ldots, \mathbf{a}_{(n)}$. Any n × 1 column can be expanded in terms
of these eigenvectors, and any solution of the equations of motion can be expanded
in terms of the normal modes.


Normal Coordinates 

When any solution is expanded in terms of the normal modes, the expansion coefficients
$\xi_i(t)$ are called normal coordinates, and each oscillates at the corresponding
frequency $\omega_i$. [Section 11.7]


Problems for Chapter 11

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult (***).

section 11.1 Two Masses and Three Springs 

11.1 * In discussing the two carts of Figure 11.1, I mentioned that it is simplest to assume that when
the two carts are in equilibrium the lengths $L_1$, $L_2$, $L_3$ of the three springs are equal to their natural,
unstretched lengths $l_1$, $l_2$, $l_3$. However, this assumption is not needed, and the three springs could all be
in tension (or compression) at the equilibrium position. (a) Find the relations among these six lengths
(and the three spring constants $k_1$, $k_2$, $k_3$) required for the two carts to be in equilibrium. (b) Show
that the net force on either cart is exactly as given in Equation (11.2), irrespective of how $L_1$, $L_2$, $L_3$
compare with $l_1$, $l_2$, $l_3$, just as long as $x_1$ and $x_2$ are measured from the carts’ equilibrium positions.

11.2 ** A massless spring (force constant $k_1$) is suspended from the ceiling, with a mass $m_1$ hanging
from its lower end. A second spring (force constant $k_2$) is suspended from $m_1$, and a second mass $m_2$
is suspended from the second spring’s lower end. Assuming that the masses move only in a vertical
direction and using coordinates $y_1$ and $y_2$ measured from the masses’ equilibrium positions, show that
the equations of motion can be written in the matrix form $\mathbf{M}\ddot{\mathbf{y}} = -\mathbf{K}\mathbf{y}$, where y is the 2 × 1 column
made up of $y_1$ and $y_2$. Find the 2 × 2 matrices M and K.

section 11.2 Identical Springs and Equal Masses 

11.3 * Find the normal frequencies for the system of two carts and three springs shown in Figure 11.1,
for arbitrary values of $m_1$ and $m_2$ and of $k_1$, $k_2$, and $k_3$. Check that your answer is correct for the case
that $m_1 = m_2$ and $k_1 = k_2 = k_3$.

11.4 ** (a) Find the normal frequencies for the system of two carts and three springs shown in Figure
11.1, for the case that $m_1 = m_2$ and $k_1 = k_3$ (but $k_2$ may be different). Check that your answer is correct
for the case that $k_1 = k_2$ as well. (b) Find and describe the motion in each of the two normal modes in
turn. Compare with the motion found for the case that $k_1 = k_2$ in Section 11.2. Explain any similarities.

11.5 ** (a) Find the normal frequencies, $\omega_1$ and $\omega_2$, for the two carts shown in Figure 11.15, assuming
that $m_1 = m_2$ and $k_1 = k_2$. (b) Find and describe the motion for each of the normal modes in turn.

Figure 11.15 Problems 11.5 and 11.6.


11.6 ** Answer the same questions as in Problem 11.5 but for the case that $m_1 = m_2$ and $k_1 = 3k_2/2$.
(Write $k_1 = 3k$ and $k_2 = 2k$.) Explain the motion in the two normal modes.

11.7 ** [Computer] The most general motion of the two carts of Section 11.2 is given by (11.21),
with the constants $A_1$, $A_2$, $\delta_1$, and $\delta_2$ determined by the initial conditions. (a) Show that (11.21) can be
rewritten as

$$\mathbf{x}(t) = (B_1\cos\omega_1 t + C_1\sin\omega_1 t)\begin{bmatrix} 1 \\ 1 \end{bmatrix} + (B_2\cos\omega_2 t + C_2\sin\omega_2 t)\begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$

This form is usually a little more convenient for matching to given initial conditions. (b) If the carts are
released from rest at positions $x_1(0) = x_2(0) = A$, find the coefficients $B_1$, $B_2$, $C_1$, and $C_2$ and plot $x_1(t)$
and $x_2(t)$. Take $A = \omega_1 = 1$ and 0 < t < 30 for your plots. (c) Same as part (b), except that $x_1(0) = A$
but $x_2(0) = 0$.

11.8 ** [Computer] Same as Problem 11.7 but in part (b) the carts are at their equilibrium positions at
t = 0 and are kicked away from each other, each with speed $v_0$. In part (c), the carts start out at their
equilibrium positions and cart 2 has speed $v_0$ to the right but cart 1 has initial speed 0. Take $v_0 = \omega_1 = 1$
and 0 < t < 30 for your plots.

11.9 ** (a) Write down the equations of motion (11.2) for the equal-mass carts of Section 11.2 with
three identical springs. Show that the change of variables to the normal coordinates $\xi_1 = \tfrac{1}{2}(x_1 + x_2)$
and $\xi_2 = \tfrac{1}{2}(x_1 - x_2)$ leads to uncoupled equations for $\xi_1$ and $\xi_2$. (b) Solve for $\xi_1$ and $\xi_2$ and hence write
down the general solution for $x_1$ and $x_2$. (Notice how very simple this procedure is, once you have
guessed what the normal coordinates are. For a simple symmetric system like this, you can sometimes
guess the form of $\xi_1$ and $\xi_2$ by considering the symmetry, especially once you have some experience
working with normal modes.)

11.10 *** [Computer] In general, the analysis of coupled oscillators with dissipative forces is much
more complicated than the conservative case considered in this chapter. However, there are a few cases
where the same methods still work, as the following problem illustrates: (a) Write down the equations
of motion corresponding to (11.2) for the equal-mass carts of Section 11.2 with three identical springs,
but with each cart subject to a linear resistive force $-bv$ (same coefficient b for both carts). (b) Show that
if you change variables to the normal coordinates $\xi_1 = \tfrac{1}{2}(x_1 + x_2)$ and $\xi_2 = \tfrac{1}{2}(x_1 - x_2)$, the equations
of motion for $\xi_1$ and $\xi_2$ are uncoupled. (c) Write down the general solutions for the normal coordinates
and hence for $x_1$ and $x_2$. (Assume that b is small, so that the oscillations are underdamped.) (d) Find
$x_1(t)$ and $x_2(t)$ for the initial conditions $x_1(0) = A$ and $x_2(0) = v_1(0) = v_2(0) = 0$, and plot them for
$0 < t < 10\pi$ using the values A = k = m = 1, and b = 0.1.

11.11 (a) Write down the equations of motion corresponding to (11.2) for the equal-mass carts of
Section 11.2 with three identical springs, but with each cart subject to a linear resistive force $-bv$ (same
coefficient b for both carts) and with a driving force $F(t) = F_0\cos\omega t$ applied to cart 1. (b) Show that if
you change variables to the normal coordinates $\xi_1 = \tfrac{1}{2}(x_1 + x_2)$ and $\xi_2 = \tfrac{1}{2}(x_1 - x_2)$, the equations of
motion for $\xi_1$ and $\xi_2$ are uncoupled. (c) Using the methods of Section 5.5, write down the general
solutions. (d) Assuming that $\beta = b/2m \ll \omega_0$, show that $\xi_1$ resonates when $\omega \approx \omega_0 = \sqrt{k/m}$ and
likewise $\xi_2$ when $\omega \approx \sqrt{3}\,\omega_0$. (e) Prove, on the other hand, that if both carts are driven in phase with
the same force $F_0\cos\omega t$, only $\xi_1$ shows a resonance. Explain.

Figure 11.16 Problem 11.12


11.12 *** Here is a different way to couple two oscillators. The two carts in Figure 11.16 have equal
masses m (though different shapes). They are joined by identical but separate springs (force constant
k) to separate walls. Cart 2 rides in cart 1, as shown, and cart 1 is filled with molasses, whose viscous
drag supplies the coupling between the carts.

(a) Assuming that the drag force has magnitude $\beta m v$ where v is the relative velocity of the
two carts, write down the equations of motion of the two carts using as coordinates $x_1$ and $x_2$, the
displacements of the carts from their equilibrium positions. Show that they can be written in matrix
form as $\ddot{\mathbf{x}} + \beta\mathbf{D}\dot{\mathbf{x}} + \omega_0^2\mathbf{x} = 0$, where x is the 2 × 1 column made up of $x_1$ and $x_2$, $\omega_0 = \sqrt{k/m}$, and D
is a certain 2 × 2 square matrix. (b) There is nothing to stop you from seeking a solution of the form
$\mathbf{x}(t) = \operatorname{Re}\,\mathbf{z}(t)$, with $\mathbf{z}(t) = \mathbf{a}e^{rt}$. Show that you do indeed get two solutions of this form with $r = i\omega_0$
or $r = -\beta + i\omega_1$ where $\omega_1 = \sqrt{\omega_0^2 - \beta^2}$. (Assume that the viscous force is weak, so that $\beta < \omega_0$.)
(c) Describe the corresponding motions. Explain why one of these modes is damped but the other
is not.

section 11.3 Two Weakly Coupled Oscillators

11.13 *** [Computer] Consider the two carts of Section 11.3, coupled by a weak spring and subject
to a resistive force $-bv$ (same force for each cart). (a) Write down the equations of motion for $x_1$
and $x_2$ in the form (11.2) and show that if you change to the normal coordinates $\xi_1 = \tfrac{1}{2}(x_1 + x_2)$ and
$\xi_2 = \tfrac{1}{2}(x_1 - x_2)$ the equations of motion for $\xi_1$ and $\xi_2$ are uncoupled. (b) Solve for $\xi_1(t)$ and $\xi_2(t)$
assuming that the dissipative coefficient b is small (“underdamped motion,” as in Section 5.4), and
hence write down the general solution for $x_1(t)$ and $x_2(t)$. (c) Suppose that cart 1 is held at $x_1 = A$ and
cart 2 at $x_2 = 0$, and they are released from rest at t = 0. Find and plot the two positions as functions
of t for 0 < t < 80, using the values A = k = m = 1, $k_2 = 0.2$, and b = 0.04. (In matching the initial
conditions, take advantage of the fact that $b \ll 1$, and use a suitable computer program to make the
plot.) Comment on your plots.

section 11.4 Lagrangian Approach: The Double Pendulum 

11.14 ** Consider two identical plane pendulums (each of length L and mass m) that are joined by a
massless spring (force constant k) as shown in Figure 11.17. The pendulums’ positions are specified
by the angles $\phi_1$ and $\phi_2$ shown. The natural length of the spring is equal to the distance between the
two supports, so the equilibrium position is at $\phi_1 = \phi_2 = 0$ with the two pendulums vertical. (a) Write
down the total kinetic energy and the gravitational and spring potential energies. [Assume that both
angles remain small at all times. This means that the extension of the spring is well approximated by
$L(\phi_2 - \phi_1)$.] Write down the Lagrange equations of motion. (b) Find and describe the normal modes
for these two coupled pendulums.

Figure 11.17 Problem 11.14

11.15 ** Write down the exact Lagrangian (good for all angles) for the double pendulum of Figure 
11.9 and find the corresponding equations of motion. Show that they reduce to Equations (11.41) and 
(11.42) if both angles are small. 

11.16 ** (a) Find the normal frequencies for small oscillations of the double pendulum of Figure 11.9
for arbitrary values of the masses and lengths. (b) Check that your answers are correct for the special
case that $m_1 = m_2$ and $L_1 = L_2$. (c) Discuss the limit that $m_2 \to 0$.

11.17 ** (a) Find the normal frequencies and modes of the double pendulum of Figure 11.9, given
that $m_1 = 8m$, $m_2 = m$, and $L_1 = L_2 = L$. (b) Find the actual motion $[\phi_1(t), \phi_2(t)]$ if the pendulum is
released from rest with $\phi_1 = 0$ and $\phi_2 = \alpha$. Is this motion periodic?

11.18 ** Two equal masses m are constrained to move without friction, one on the positive x axis and 
one on the positive y axis. They are attached to two identical springs (force constant k) whose other 
ends are attached to the origin. In addition, the two masses are connected to each other by a third spring 
of force constant k'. The springs are chosen so that the system is in equilibrium with all three springs
relaxed (length equal to unstretched length). What are the normal frequencies? Find and describe the 
normal modes. 



Figure 11.18 Problem 11.19 


11.19 *** A simple pendulum (mass M and length L) is suspended from a cart (mass m) that can
oscillate on the end of a spring of force constant k, as shown in Figure 11.18. (a) Assuming that the
angle $\phi$ remains small, write down the system’s Lagrangian and the equations of motion for x and $\phi$.
(b) Assuming that m = M = L = g = 1 and k = 2 (all in appropriate units) find the normal frequencies,
and for each normal frequency find and describe the motion of the corresponding normal mode.

11.20 *** A thin uniform rod of length 2b is suspended by two vertical light strings, both of
fixed length l, fastened to the ceiling. Assuming only small displacements from equilibrium, find the
Lagrangian of the system and the normal frequencies. Find and describe the normal modes. [Hint: A
possible choice of generalized coordinates would be x, the longitudinal displacement of the rod, and
$y_1$ and $y_2$, the sideways displacements of the rod’s two ends. You’ll need to find how high the two ends
are above their equilibrium height and what angle the rod has turned through.]

sections 11.5 and 11.6 The General Case & Three Coupled Pendulums 

11.21 * Verify that if $U = \tfrac{1}{2}\sum_j\sum_k K_{jk}\,q_j q_k$, where the coefficients $K_{jk}$ are all constant and satisfy
$K_{jk} = K_{kj}$, then $\partial U/\partial q_i = \sum_j K_{ij}\,q_j$, as claimed in Equation (11.58).

11.22 * Write down the exact potential energy of the three pendulums of Figure 11.13, good for all 
angles, small or large, and show that your answer reduces to (11.68) if all three angles are small. 

11.23 ** Equation (11.73) gives the three normal frequencies of three coupled pendulums in natural
units with L = m = 1. We have already seen that the value of $\omega_1^2$ in arbitrary units is g/L. Find the
values of $\omega_2^2$ and $\omega_3^2$ in arbitrary units. [Hint: Start by considering the quantity $\omega_2^2 - \omega_1^2$.]



Figure 11.19 Problem 11.24 


11.24 ** Two equal masses m move on a frictionless horizontal table. They are held by three identical 
taut strings (each of length L, tension T), as shown in Figure 11.19, so that their equilibrium position 
is a straight line between the anchors at A and B. The two masses move in the transverse (y) direction, 
but not in the longitudinal (x) direction. Write down the Lagrangian for small displacements, and find 
and describe the motion in the corresponding normal modes. [Hint: “Small” displacements have $y_1$ and
$y_2$ much less than L, which means that you can treat the tensions as constant. Therefore the PE of each
string is just Td, where d is the amount by which its length has increased from equilibrium.] 

11.25 ** Consider a system of carts and springs like that in Figure 11.1 except that there are three
equal-mass carts and four identical springs. Solve for the three normal frequencies, and find and describe the
motion in the corresponding normal modes.

11.26 ** A bead of mass m is threaded on a frictionless circular wire hoop of radius R and mass m
(same mass). The hoop is suspended at the point A and is free to swing in its own vertical plane as
shown in Figure 11.20. Using the angles $\phi_1$ and $\phi_2$ as generalized coordinates, solve for the normal
frequencies of small oscillations, and find and describe the motion in the corresponding normal modes.
[Hint: The KE of the hoop is $\tfrac{1}{2}I\dot\phi_1^2$, where I is its moment of inertia about A and can be found using
the parallel axis theorem.]

11.27 ** Consider two equal-mass carts on a horizontal, frictionless track. The carts are connected to
each other by a single spring of force constant k, but are otherwise free to move along the track.
(a) Write down the Lagrangian and find the normal frequencies of the system. Show that one of the
normal frequencies is zero. (b) Find and describe the motion in the normal mode whose frequency is
nonzero. (c) Do the same for the mode with zero frequency. [Hint: This one requires some thought.
It isn’t immediately clear what oscillations of zero frequency are. Notice that the eigenvalue equation
$(\mathbf{K} - \omega^2\mathbf{M})\mathbf{a} = 0$ reduces to $\mathbf{K}\mathbf{a} = 0$ in this case. Consider a solution $\mathbf{x}(t) = \mathbf{a}f(t)$, where f(t) is an
undetermined function of t, and use the equation of motion, $\mathbf{M}\ddot{\mathbf{x}} = -\mathbf{K}\mathbf{x}$, to show that this solution
represents motion of the whole system with constant velocity. Explain why this kind of motion is
possible here but not in the previous examples.]

11.28 ** A simple pendulum (mass M and length L) is suspended from a cart of mass m that moves 
freely along a horizontal track. (See Figure 11.18, but imagine the spring removed.) (a) What are 
the normal frequencies? (b) Find and describe the corresponding normal modes. [See the hint for
Problem 11.27.]

11.29 A thin rod of length 2b and mass m is suspended by its two ends with two identical vertical 
springs (force constant k ) that are attached to the horizontal ceiling. Assuming that the whole system 
is constrained to move in just the one vertical plane, find the normal frequencies and normal modes of 
small oscillations. Describe and explain the normal modes. [Hint: It is crucial to make a wise choice 
of generalized coordinates. One possibility would be r, $\phi$, and $\alpha$, where r and $\phi$ specify the position
of the rod’s CM relative to an origin half way between the springs on the ceiling, and $\alpha$ is the angle of
tilt of the rod. Be careful when writing down the potential energy.] 

11.30 *** [Computer] Consider a system of carts and springs like that in Figure 11.1 except that there
are four equal-mass carts and five identical springs. Solve for the four normal frequencies, and find
and describe the motion in the corresponding normal modes. [This can be solved without the aid of 
a computer, but it would probably be worth your while to learn how to evaluate determinants and to 
solve the characteristic equation using suitable computer software.] 

11.31 *** Consider a frictionless rigid horizontal hoop of radius R. Onto this hoop I thread three beads
with masses 2m, m, and m, and, between the beads, three identical springs, each with force constant 
k. Solve for the three normal frequencies and find and describe the three normal modes. 

11.32 *** As a model of a linear triatomic molecule (such as CO$_2$), consider the system shown in
Figure 11.21, with two identical atoms each of mass m connected by two identical springs to a single
atom of mass M. To simplify matters, assume that the system is confined to move in one dimension.
(a) Write down the Lagrangian and find the normal frequencies of the system. Show that one of the
normal frequencies is zero. (b) Find and describe the motion in the normal modes whose frequencies
are nonzero. (c) Do the same for the mode with zero frequency. [Hint: See the comments at the end of
Problem 11.27.]

SECTION 11.7 Normal Coordinates

11.33 * The eigenvectors $\mathbf{a}^{(1)}$ and $\mathbf{a}^{(2)}$ that describe the motion in the two normal modes of the two
carts of Section 11.2 are given in (11.78). Prove that any (2 × 1) column x can be written as a linear
combination of these two eigenvectors; that is, $\mathbf{a}^{(1)}$ and $\mathbf{a}^{(2)}$ are a basis of the space of (2 × 1) columns.

11.34 ** It is a crucial property of the eigenvectors, $\mathbf{a}^{(1)}, \cdots, \mathbf{a}^{(n)}$, describing the motion in the normal
modes of an oscillating system that any (n × 1) column x can be written as a linear combination of
the n eigenvectors; that is, the eigenvectors are a basis of the space of (n × 1) columns. This is proved
in the appendix, but to illustrate it, do the following: (a) Write down the three eigenvectors $\mathbf{a}^{(1)}$, $\mathbf{a}^{(2)}$,
and $\mathbf{a}^{(3)}$ for the coupled pendulums of Section 11.6. (Each of these contains an arbitrary overall factor,
which you can choose at your convenience.) Prove that they have the property that any (3 × 1) column
x can be expanded in terms of them. (b) The expansion coefficients in this expansion are the normal
coordinates $\xi_1$, $\xi_2$, $\xi_3$. Find the normal coordinates for the three coupled pendulums, and explain the
sense in which they describe the three normal modes of Figure 11.14.

11.35 ** Consider the two coupled pendulums of Problem 11.14. (a) What would be a natural choice
for the normal coordinates $\xi_1$ and $\xi_2$? (b) Show that even if both pendulums are subject to a resistive
force of magnitude bv (with b small), the equations of motion for $\xi_1$ and $\xi_2$ are still uncoupled. (c) Find
and describe the motion of the two pendulums for the two modes.

Part I of this book contains the material that is in some sense essential. Part II contains
mostly more advanced topics that could be considered optional, although all are of
the greatest importance in modern physics. While the chapters of Part I were designed
to be read pretty much in sequence, I have tried to make the five chapters of Part II
mutually independent, so that you could pick and choose according to your tastes and
available time. If you understand most of the material of Part I, you are ready to launch
into any of the chapters of Part II. This is not to say that you have to have studied all
of Part I before reading any of Part II. For example, you could read Chapter 12 on
chaos as soon as you have studied Chapter 5 on oscillations. Similarly, you could read
Chapter 13 on Hamiltonian mechanics immediately after Chapter 7 on Lagrangian
mechanics, and likewise Chapter 14 on collision theory immediately after Chapter 8
on two-body motion.
II


Nonlinear Mechanics and Chaos 


One of the most fascinating and exciting discoveries of the last few decades has 
been the recognition that most systems whose equations of motion are nonlinear can 
exhibit chaos. This startling phenomenon, which shows up in many different areas — 
oscillating mechanical systems, chemical reactions, fluid flow, lasers, population 
growth, the spread of diseases, and many more — means that although a system obeys 
deterministic equations of motion (such as Newton’s laws) its detailed future behavior 
may, as a practical matter, be unpredictable. 

The behavior of a chaotic mechanical system can be very complicated, and the 
need to describe this behavior has spawned a whole array of new ways to view the 
motion of such systems: state-space orbits, Poincaré sections, bifurcation diagrams.
Fortunately, there are systems that are complicated enough to exhibit chaos, but still 
simple enough not to need all these new tools for their description. In particular, a 
driven damped pendulum, whose equation of motion is nonlinear, can exhibit chaos 
but can be described in reasonably elementary terms. For this reason, I shall mostly 
focus on this system here. After a brief review of the difference between linear and 
nonlinear equations and of the properties of a damped linear oscillator, I shall describe 
in some detail the motion of a driven damped pendulum, using just straightforward 
graphs of its position $\phi$ against time. Once you are familiar with the basic phenomena,
I shall give an introduction to some of the more sophisticated tools that you will need 
if you want to explore the rapidly expanding literature on chaos. To conclude this 
chapter I shall describe another system that can exhibit chaos — the so-called logistic 
map. Although this is not strictly part of mechanics, it shows many strong parallels 
with mechanical systems, and has the great advantage of a simplicity that allows an 
easy understanding of several aspects of its behavior. It also serves to illustrate some 
of the amazing universality of chaos. 

In planning this chapter, I felt strongly that it was important to convey a good 
understanding of a few topics, rather than a superficial glimpse of the whole field. In 
particular, chaos theory is divided into two broad areas — dissipative systems, such as 
the damped pendulum, and nondissipative, or “Hamiltonian” systems — and I decided 
to restrict myself entirely to the former. I know that some readers will question this 
decision. Many important applications of chaos theory (in astronomy and statistical
mechanics, for instance) concern nondissipative systems, as did the pioneering work
of Poincaré. Nevertheless, it is my experience that the most accessible topics for
the beginner concern dissipative systems, and, to keep the length of this chapter 
reasonably finite, I decided to treat only these. Please see this chapter as a sampler of 
the good things in chaos theory; it certainly makes no claim to tell the whole story. 

This chapter is very different from all the other chapters of this book. The theory 
of chaos is new and not at all elementary. (And parts of the theory have yet to be 
discovered!) It requires a much deeper understanding of differential equations than 
I am assuming in this book, and a proper exposition of chaos theory requires a 
whole book rather than one chapter. 1 Therefore, I shall restrict myself here to simply 
describing the fascinating main properties of chaotic motion, without much attempt to 
prove that the motion is as I claim. This is actually a reasonably satisfactory situation. 
Before you try to read any of the more advanced books, it is almost certainly a good 
thing to have some idea of what chaos involves and some familiarity with the tools 
used to describe it, and these are what I hope to communicate. 


12.1 Linearity and Nonlinearity 


For a system to exhibit chaos its equations of motion must be nonlinear. We have 
noted examples of linear and nonlinear equations from time to time in this book, but 
let us review the two concepts now. A differential equation is linear if it involves the 
dependent variable or variables and their derivatives only linearly. The equation of 
motion of a cart (mass m) on a spring (force constant k),

$$ m\ddot{x} = -kx, \tag{12.1} $$

is a linear differential equation for the cart’s position x. Similarly, the equations of
motion for the two carts discussed in Chapter 11 [Equations (11.2) for example] are
linear equations for the two carts’ positions $x_1$ and $x_2$. If we apply a driving force F(t)
to the cart of Equation (12.1), the resulting equation,

$$ m\ddot{x} = -kx + F(t), \tag{12.2} $$

is still linear [though no longer homogeneous, since the “inhomogeneous” term F(t)
does not involve the dependent variable x at all]. By contrast, the equation of motion
for a simple pendulum (mass m, length L) is $I\ddot{\phi} = \Gamma$ or

$$ mL^2\ddot{\phi} = -mgL\sin\phi, \tag{12.3} $$

which is a nonlinear equation for $\phi$, since $\sin\phi$ is not linear in $\phi$. (If the oscillations
are small, then $\sin\phi \approx \phi$, and the equation is well approximated by a linear equation;


1 Several such books exist, of which my favorite is Nonlinear Dynamics and Chaos by Steven 
H. Strogatz, Addison-Wesley, Reading, MA (1994), but be warned, it takes eight chapters of 
mathematical preliminaries to get to the chaos. 
in general, however, the equation for the simple pendulum is definitely nonlinear.)
Another example is the equation of motion of a single planet in the field of the sun,

$$ m\ddot{\mathbf{r}} = -GmM\hat{\mathbf{r}}/r^2, \tag{12.4} $$

which is a nonlinear equation for the variables $\mathbf{r} = (x, y, z)$ because the force term is
nonlinear in x, y, and z. These two examples show that nonlinear equations are not 
especially unusual. On the contrary, many perfectly everyday systems have equations 
of motion that are nonlinear. 
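
To get a feel for how quickly the small-angle approximation behind the linearized pendulum breaks down, the following minimal sketch tabulates the relative error made in replacing sin φ by φ. The function name and the sample angles are my own illustrative choices and are not taken from the text.

```python
import math

def small_angle_relative_error(phi):
    """Relative error of approximating sin(phi) by phi (phi in radians, nonzero)."""
    return (phi - math.sin(phi)) / math.sin(phi)

if __name__ == "__main__":
    # Sample amplitudes in degrees (illustrative values only).
    for degrees in (5, 20, 45, 90):
        phi = math.radians(degrees)
        err = small_angle_relative_error(phi)
        print(f"{degrees:3d} deg: relative error = {err:.4f}")
```

The error is a fraction of a percent at a few degrees but exceeds fifty percent at 90°, which is why the linear equation is trustworthy only for small oscillations.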

So far in this book the main difference between linear and nonlinear differential 
equations has been that the former have been easily solved analytically, whereas most 
of the latter have been impossible to solve analytically. In fact, our experience in this 
regard reflects the true state of affairs: Almost all of the linear equations of mechanics 
are analytically solvable, and almost none of the nonlinear ones are. 2 This circumstance
is largely to blame for the failure until recently of scientists to recognize that
chaos is an important and widespread phenomenon. Because nonlinear equations are
so intractable, textbooks always focused on linear problems. When nonlinear problems
had to be addressed, they were often solved using approximations that reduced
them to linear problems. In this way, the astonishingly rich variety of complications
that occur for nonlinear systems went almost completely unrecognized. The first person
to notice some of the symptoms of chaos was the French mathematician Poincaré
(1854-1912) in his studies of the gravitational three-body problem, the motion of
three bodies (such as the sun, earth, and moon) interacting via the gravitational force.
The equation of motion for this system is nonlinear, like its two-body counterpart
(12.4), and Poincaré observed that it exhibits the phenomenon now called sensitivity
to initial conditions, which is one of the characteristics of chaotic motion, as we shall
see.

That Poincaré’s observation of chaos went nearly unnoticed by physicists until the
1970s is probably due to several factors. The discoveries of relativity (1905) and then 
of quantum mechanics (around 1925) diverted most physicists’ attention away from 
classical mechanics. And the difficulty of solving nonlinear equations without the aid 
of computers certainly discouraged the pursuit of nonlinear problems. In any case it 
was only in the 1970s that computer solutions of various nonlinear problems 3 drew 
the attention of significant numbers of scientists (physicists and many others) to the 
phenomenon that we now call chaos. 

Nonlinearity is essential for chaos — if a system’s equations of motion are linear, 
it cannot exhibit chaos. But nonlinearity does not guarantee chaos. For example, the 
equation (12.3) for a simple pendulum is nonlinear, but even when the amplitude is 
large (and the linear approximation is definitely not good) the simple pendulum never 


2 One of the rare examples of a solvable nonlinear equation is (12.4) for a planet, whose orbit 
we found in Chapter 8. But notice that we did this by a cunning change of variables that reduced 
the nonlinear equation (8.37) for r to the linear equation (8.45) for u. 

3 The first such calculation, of atmospheric convection, was made by the meteorologist Edward 
Lorenz at MIT in 1963, but this work did not attract widespread attention for another decade. For an 
exhaustive, but very readable, history of chaos theory see Chaos: Making a New Science by James
Gleick, Viking-Penguin, New York (1987). 





exhibits chaos. On the other hand, if we add in a damping force $-bv = -bL\dot{\phi}$ and a
driving force F(t), (12.3) becomes the equation of the driven, damped pendulum:

$$ mL^2\ddot{\phi} = -mgL\sin\phi - bL^2\dot{\phi} + LF(t) \tag{12.5} $$

and this equation does lead to chaos for some values of the parameters. Loosely 
speaking, the requirement for chaos is that the equations of motion be nonlinear and 
somewhat complicated. Equation (12.3) for the simple pendulum is not sufficiently 
complicated, but Equation (12.5) for the driven damped pendulum is. Unfortunately, 
a discussion of precisely what is “sufficiently complicated” to produce chaos would 
be well beyond the scope of this book. 4 

Another relatively simple example of a nonlinear system that exhibits chaos is the
double pendulum of Section 11.4. In the small angle approximation, the equations of
motion for the double pendulum are linear equations for the two angles $\phi_1$ and $\phi_2$ [see
(11.41) and (11.42)], but in general they are nonlinear (see Problem 11.15), and they
are sufficiently complicated to produce chaos. The driven damped pendulum and the
double pendulum are two of the simplest mechanical systems that exhibit chaos. The
driven damped pendulum has just one degree of freedom (one coordinate, $\phi$, needed
to specify its configuration), whereas the double pendulum has two (two coordinates,
$\phi_1$ and $\phi_2$, needed). For this reason the driven damped pendulum is the simpler one to
analyze and will be the main focus of our discussion here.


What’s Special about Nonlinearity? 

In the enormous set of all possible differential equations, the linear equations form a
minuscule subset, with many simple properties that are not shared by the general nonlinear
equation. Thus it is really the linear equations which are “special.” Nevertheless,
for the reasons already mentioned, many physicists are much more conversant with
the linear case, and they are sometimes tempted to assume that familiar properties of
linear equations will carry over to nonlinear equations. This dangerous assumption is
frequently wrong. In particular, the main message of this chapter is that chaos, which
never appears in linear systems, is a common occurrence in nonlinear systems. Unfortunately,
the underlying theory of this particular difference is beyond the scope of this
book, and we shall have to be content with seeing some simple examples of chaotic
motion, without a detailed understanding of why chaos occurs. Here, I would like to
mention just one huge difference between linear and nonlinear equations that highlights
the importance of not letting the linear prejudice that many of us share mislead
us in our study of nonlinear equations.


4 For the record, the criterion is this: As we shall see in Chapter 13, a set of second-order
differential equations (like Newton’s second law) for n variables can usually be rewritten as a
set of first-order equations for N variables, $\xi_1, \cdots, \xi_N$ where $N \ge n$, with the general form
$\dot{\xi}_i = f_i(\xi_1, \cdots, \xi_N)$ for $i = 1, \cdots, N$. For instance, if we write $\dot{\phi} = \omega$, the one equation (12.3) for the
angle $\phi$ of the simple pendulum becomes two first-order equations, one for $\phi$ and one for $\omega$, namely
$\dot{\phi} = \omega$ and $\dot{\omega} = -(g/L)\sin\phi$. When the right-hand sides of these equations are independent of t
(as they are here) the equations are said to be autonomous. For a dissipative system to exhibit chaos,
its equations of motion, when put in this standard autonomous form, must be nonlinear and have N
variables with $N \ge 3$. Nondissipative systems need nonlinearity and $N \ge 4$.
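
As a worked illustration of the first-order form described in footnote 4, here is one way the driven damped pendulum (12.5) could be put in autonomous form. This is only a sketch under an assumption of mine: the drive is taken to be sinusoidal, F(t) = F_0 cos ωt, with ω the drive frequency (not the angular velocity that the footnote calls ω), and the labels ξ_1, ξ_2, ξ_3 are chosen just for this example.

```latex
\begin{aligned}
\xi_1 &= \phi, \qquad \xi_2 = \dot{\phi}, \qquad \xi_3 = \omega t
      \quad \text{(drive phase, introduced as an extra variable)} \\
\dot{\xi}_1 &= \xi_2, \\
\dot{\xi}_2 &= -\frac{g}{L}\sin\xi_1 - \frac{b}{m}\,\xi_2 + \frac{F_0}{mL}\cos\xi_3, \\
\dot{\xi}_3 &= \omega .
\end{aligned}
```

None of the right-hand sides depends explicitly on t, so with these assumptions the driven damped pendulum has N = 3, whereas the simple pendulum (12.3) has only N = 2.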


Nonlinear Equations Don’t Obey the Superposition Principle 

We saw in Chapter 5 that linear homogeneous equations satisfy the superposition 
principle — that any linear combination of solutions gives us another solution. We 
have used this result several times, particularly in Chapters 5 and 11, but let me refresh 
your memory with the example of a second-order equation of the form 

$$ p(t)\ddot{x}(t) + q(t)\dot{x}(t) + r(t)x(t) = 0, \tag{12.6} $$

where x(t) is the unknown and p(t), q(t), and r(t) are known fixed functions. [An
example of such an equation is (12.1) for a cart on a spring.] Notice first that, because
every term in this equation is linear in x(t) (or its derivatives), we can multiply through
by any constant a and see at once that if x(t) is a solution, then so is ax(t). Second, if
$x_1(t)$ and $x_2(t)$ are both solutions, then we can add the two corresponding equations,
one for $x_1(t)$ and one for $x_2(t)$, and conclude that $x_1(t) + x_2(t)$ is also a solution. Thus
any linear combination

$$ x(t) = a_1 x_1(t) + a_2 x_2(t) $$

is also a solution of (12.6), the result called the superposition principle. On the other
hand, it is easy to see that neither of the arguments just given works if the equation is
nonlinear. [Make sure you can see this. Suppose, for example, the last term in (12.6)
was $r(t)\sqrt{x(t)}$; see Problem 12.3.] Therefore, the superposition principle does not
apply to nonlinear equations.
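
The failure of superposition can be checked directly on explicit solutions. The sketch below is only an illustration, with equations of my own choosing: it uses ẍ = −x (a linear equation of the form (12.6) with p = r = 1 and q = 0) and, as a nonlinear counterexample that does not appear in the text, ẍ = 6x², whose solutions include x = 1/(t + c)². Each solution is stored together with its exact second derivative, so the residual of a linear combination can be evaluated without numerical differentiation. All function names are hypothetical.

```python
import math

# Each solution is a pair (x, xddot): the function and its exact second derivative.
LINEAR_SOLUTIONS = [
    (math.cos, lambda t: -math.cos(t)),   # x = cos t solves x'' + x = 0
    (math.sin, lambda t: -math.sin(t)),   # x = sin t solves x'' + x = 0
]
NONLINEAR_SOLUTIONS = [
    (lambda t: 1.0 / (t + 1.0) ** 2, lambda t: 6.0 / (t + 1.0) ** 4),  # solves x'' = 6 x^2
    (lambda t: 1.0 / (t + 2.0) ** 2, lambda t: 6.0 / (t + 2.0) ** 4),  # solves x'' = 6 x^2
]

def combine(a1, a2, solutions):
    """Return (x, xddot) for the linear combination a1*x1 + a2*x2."""
    (x1, x1dd), (x2, x2dd) = solutions
    x = lambda t: a1 * x1(t) + a2 * x2(t)
    xdd = lambda t: a1 * x1dd(t) + a2 * x2dd(t)
    return x, xdd

def linear_residual(x, xdd, t):
    """Left side of x'' + x = 0; zero when x is a solution."""
    return xdd(t) + x(t)

def nonlinear_residual(x, xdd, t):
    """Left side of x'' - 6 x^2 = 0; zero when x is a solution."""
    return xdd(t) - 6.0 * x(t) ** 2

if __name__ == "__main__":
    t = 0.5
    x, xdd = combine(2.0, 3.0, LINEAR_SOLUTIONS)
    print("linear combination residual:   ", linear_residual(x, xdd, t))
    x, xdd = combine(2.0, 3.0, NONLINEAR_SOLUTIONS)
    print("nonlinear combination residual:", nonlinear_residual(x, xdd, t))
```

The linear combination still satisfies the linear equation, while the same combination of solutions of the nonlinear equation leaves a residual that is far from zero.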

An important consequence of the superposition principle that we have used repeatedly
is this: To find all the solutions of (12.6) we have only to find two independent
solutions $x_1(t)$ and $x_2(t)$; then every solution can be expressed as a linear combination
of $x_1(t)$ and $x_2(t)$. More generally, to find all the solutions of an nth order homogeneous
linear differential equation, we have only to find n independent solutions and
then every solution can be expressed as a linear combination of these n solutions.
Since the superposition principle does not apply to nonlinear equations, this dramatic
simplification is not available for them.

There is a corresponding situation for inhomogeneous equations, such as (12.2) 
and (12.5), again as we saw in Chapter 5. If $x_p(t)$ is any one particular solution
of a linear nth order inhomogeneous equation, then every solution can be written
as $x_p(t)$ plus a linear combination of n independent solutions of the corresponding
homogeneous equation. For nonlinear equations, there is no corresponding result 
(Problem 12.4). Thus every solution of any nth order linear equation (homogeneous 
or inhomogeneous) can be expressed simply in terms of n independent functions, but 
for nonlinear equations there is no such simple expression. 

With these general observations on nonlinear equations, let us take up the one 
nonlinear equation that we shall discuss in detail, Equation (12.5) for a driven damped 
pendulum. I shall first describe some features of its motion that we would expect (or at
least that are not wholly unexpected), and I shall then take up the surprising features 
associated with the pendulum’s chaotic motion. 


12.2 The Driven Damped Pendulum DDP 


The equation of motion for the driven damped pendulum (or DDP) was given in 
(12.5). Since this equation is going to occupy us for several sections to come, I would 
like to make quite sure that you are clear where it came from and to tidy it up. The 
pendulum is sketched in Figure 12.1. The equation of motion is just $I\ddot{\phi} = \Gamma$, where I
is the moment of inertia and $\Gamma$ is the net torque about the pivot. In this case $I = mL^2$,
and the torque arises from the three forces shown in Figure 12.1. The resistive force
has magnitude bv and hence exerts a torque $-Lbv = -bL^2\dot{\phi}$. The torque of the weight
is $-mgL\sin\phi$, and that of the driving force is LF(t). Thus the equation of motion
$I\ddot{\phi} = \Gamma$ is

$$ mL^2\ddot{\phi} = -bL^2\dot{\phi} - mgL\sin\phi + LF(t) \tag{12.7} $$

exactly as in (12.5).

Throughout this chapter I shall assume that the driving force F(t) is sinusoidal, 
specifically that 

$$ F(t) = F_0\cos(\omega t) \tag{12.8} $$

where $F_0$ is the drive amplitude (the amplitude of the driving force) and $\omega$ the
drive frequency. As I argued in Chapter 5, several real and interesting driving
forces approximate this sinusoidal form quite closely, and it has proved possible to
reproduce such sinusoidal forces with remarkable precision for experiments on chaos.
Substituting into (12.7) and reorganizing a little, we find that

$$ \ddot{\phi} + \frac{b}{m}\dot{\phi} + \frac{g}{L}\sin\phi = \frac{F_0}{mL}\cos\omega t. \tag{12.9} $$



Figure 12.1 The three important forces on the driven damped pendulum
are the resistive force with magnitude bv, the weight mg,
and the driving force F(t). (There is also a reaction force from the
pivot at the top, but this contributes nothing to the torque.)

In this equation, you will recognize the coefficient b/m as the constant that we
renamed as $2\beta$ in Chapter 5,

$$ \frac{b}{m} = 2\beta, $$

where $\beta$ was called the damping constant. Similarly the coefficient g/L is just $\omega_0^2$,

$$ \frac{g}{L} = \omega_0^2, $$

where $\omega_0$ is the natural frequency of the pendulum. Finally, the coefficient $F_0/mL$
must have the dimensions of (time)$^{-2}$; that is, $F_0/mL$ has the same units as $\omega_0^2$. It
is convenient to rewrite this coefficient as $F_0/mL = \gamma\omega_0^2$. That is, we introduce a
dimensionless parameter

$$ \gamma = \frac{F_0}{mL\omega_0^2} = \frac{F_0}{mg}, \tag{12.10} $$

which I shall call the drive strength and is just the ratio of the drive amplitude $F_0$ to
the weight mg. This parameter $\gamma$ is a dimensionless measure of the strength of the
driving force. When $\gamma < 1$, the drive force is less than the weight and we would expect
it to produce a relatively small motion. (For instance, the drive force is insufficient to
hold the pendulum out at $\phi = 90°$.) Conversely, if $\gamma > 1$, the drive force exceeds the
pendulum’s weight, and we should anticipate that it will produce larger scale motions
(for instance, motion in which the pendulum is pushed all the way over the top at
$\phi = \pi$).

Making all these substitutions, we get our final form of the equation of motion 
(12.9) for a driven damped pendulum 


$$ \ddot{\phi} + 2\beta\dot{\phi} + \omega_0^2\sin\phi = \gamma\omega_0^2\cos\omega t. \tag{12.11} $$


This is the equation whose solutions we shall be studying for the next several 
sections. 
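
Because (12.11) has no analytic solution, it has to be integrated numerically. The following is a minimal sketch using the classical fourth-order Runge-Kutta method with a fixed step; this is my choice of integrator and not necessarily the one used for the figures in this chapter. The function names and the step size are hypothetical, and the damping constant β = ω_0/4 in particular is an assumption made only for illustration, since no value of β is given in this passage. The other parameter values are those quoted in Section 12.3 for Figure 12.2.

```python
import math

def ddp_derivatives(t, phi, phidot, gamma, beta, omega0, omega):
    """Equation (12.11) written as two first-order equations for phi and phidot."""
    phiddot = (-2.0 * beta * phidot
               - omega0 ** 2 * math.sin(phi)
               + gamma * omega0 ** 2 * math.cos(omega * t))
    return phidot, phiddot

def solve_ddp(gamma, beta, omega0, omega, phi0=0.0, phidot0=0.0,
              t_end=10.0, dt=0.001):
    """Integrate (12.11) with fixed-step RK4; returns lists of times and angles."""
    n_steps = int(round(t_end / dt))
    phi, phidot = phi0, phidot0
    times, angles = [0.0], [phi0]
    args = (gamma, beta, omega0, omega)
    for i in range(n_steps):
        t = i * dt
        k1p, k1v = ddp_derivatives(t, phi, phidot, *args)
        k2p, k2v = ddp_derivatives(t + dt / 2, phi + dt / 2 * k1p,
                                   phidot + dt / 2 * k1v, *args)
        k3p, k3v = ddp_derivatives(t + dt / 2, phi + dt / 2 * k2p,
                                   phidot + dt / 2 * k2v, *args)
        k4p, k4v = ddp_derivatives(t + dt, phi + dt * k3p,
                                   phidot + dt * k3v, *args)
        phi += dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        phidot += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        times.append((i + 1) * dt)
        angles.append(phi)
    return times, angles

if __name__ == "__main__":
    omega = 2.0 * math.pi      # drive frequency, so the drive period is 1
    omega0 = 1.5 * omega       # natural frequency
    beta = omega0 / 4.0        # ASSUMED damping constant (not given in this passage)
    times, angles = solve_ddp(gamma=0.2, beta=beta, omega0=omega0, omega=omega)
    for t, phi in zip(times[::1000], angles[::1000]):
        print(f"t = {t:5.2f}   phi = {phi:+.5f}")
```

With the drive period equal to 1 and 1000 steps per period, the printed values are samples taken once per drive cycle; for a weak drive they should stop changing once the transients have died out.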


12.3 Some Expected Features of the DDP

Properties of the Linear Oscillator 

To appreciate the extraordinary richness of the chaotic motion of our driven damped
pendulum, we must first review what sort of behavior we might expect, based on our
experiences with linear oscillators. Specifically, if we release the pendulum near the
equilibrium position $\phi = 0$ with a small initial velocity and if the drive strength is
small, $\gamma \ll 1$, we would expect $\phi$ to remain small at all times. Thus we should be
able to approximate the term $\sin\phi$ in (12.11) as $\phi$, and the equation of motion would
become the linear equation

$$ \ddot{\phi} + 2\beta\dot{\phi} + \omega_0^2\phi = \gamma\omega_0^2\cos\omega t \tag{12.12} $$

which has exactly the form of (5.57) for the linear oscillator of Chapter 5. Thus the
“expected” behavior of the driven damped pendulum, at least for a weak enough
driving force, is just the behavior described in Section 5.5. This behavior can be
quickly summarized: The initial behavior of the pendulum depends on the initial
conditions, but any differences (or “transients”) due to the initial conditions die out
rapidly, and the motion approaches a unique “attractor,” in which the pendulum
oscillates sinusoidally with exactly the frequency of the driving force:

$$ \phi(t) = A\cos(\omega t - \delta). \tag{12.13} $$


Figure 12.2 The motion of a DDP with a relatively weak drive strength
of $\gamma = 0.2$. The drive period was chosen to be $\tau = 1$, so that the horizontal
axis gives the time in units of the drive period. You can see clearly
that after about two cycles the motion has settled down to a perfectly
sinusoidal motion with period equal to the drive period.

These predictions are nicely illustrated in Figure 12.2, which shows the actual
motion of the driven damped pendulum for a fairly weak drive strength of $\gamma = 0.2$.
[Since the exact equation of motion (12.11) cannot be solved analytically, this and all
subsequent plots of the motion of the DDP were made from numerical solutions of
(12.11).$^5$] The drive frequency was chosen to be $\omega = 2\pi$, so that the drive period is
$\tau = 2\pi/\omega = 1$. This means that the horizontal axis shows time in units of the drive
period. The natural frequency was chosen to be $\omega_0 = 1.5\omega$, so that the system is fairly
close to resonance, since this is where chaotic motion is usually easiest to find.$^6$ The
most striking feature of this plot is that after about two cycles the motion has settled
down to a purely sinusoidal motion with exactly the period of the driving force, $\tau = 1$.
The initial conditions chosen for this plot were that $\phi = \dot{\phi} = 0$ at $t = 0$. It is a fact
(though not one that our one plot can show) that, whatever initial conditions we were
to choose, the motion of a linear oscillator would always approach the same unique
attractor as the initial transients die out.


5 All of these plots were made using Mathematica’s numerical solver NDSolve. For many plots 
the default precision of 15 digits was more than sufficient, but where there was any reason for doubt, 
the precision was increased in integer steps until two successive calculations were indistinguishable. 

6 The other parameters used were as follows: Damping constant $\beta = \omega_0/4$, and initial conditions
$\phi = \dot{\phi} = 0$ at $t = 0$.

To summarize, for a linear damped oscillator, with a sinusoidal driving force: (1)
There is a unique attractor which the motion approaches, irrespective of the chosen 
initial conditions. (2) The motion of this attractor is itself sinusoidal with frequency 
exactly equal to the drive frequency. 
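For the linear equation (12.12), the attractor's amplitude and phase follow from the standard result for a driven damped linear oscillator, $A = f_0/\sqrt{(\omega_0^2 - \omega^2)^2 + 4\beta^2\omega^2}$ and $\delta = \arctan[2\beta\omega/(\omega_0^2 - \omega^2)]$, with $f_0 = \gamma\omega_0^2$ here. As a rough illustration (the function name `linear_response` is hypothetical, and the formula is quoted from the general theory rather than derived in this section), the values for the parameters of Figure 12.2 can be evaluated as follows:

```python
import math


def linear_response(gamma, omega, omega0, beta):
    """Amplitude A and phase delta of the attractor (12.13) of the linear equation (12.12)."""
    f0 = gamma * omega0**2  # drive term on the right side of (12.12)
    amplitude = f0 / math.sqrt((omega0**2 - omega**2)**2 + 4.0 * beta**2 * omega**2)
    phase = math.atan2(2.0 * beta * omega, omega0**2 - omega**2)
    return amplitude, phase


omega = 2.0 * math.pi    # drive frequency (drive period 1)
omega0 = 1.5 * omega     # natural frequency
beta = omega0 / 4.0      # damping constant

A, delta = linear_response(0.2, omega, omega0, beta)
print(f"A = {A:.4f} rad, delta = {delta:.4f} rad")
```

With these parameters the expression reduces to $A = 1.8/\sqrt{34}$, about 0.31 rad, which is small enough that the approximation $\sin\phi \approx \phi$ is reasonable.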

Nearly Linear Oscillations of the DDP 

Let us now imagine increasing the drive strength so that the amplitude of oscillation
increases to a value where the approximation

$$\sin\phi \approx \phi$$

is no longer satisfactory. As long as the amplitude is not too large, we would expect to
get a satisfactory approximation by including just one more term in the Taylor series
for $\sin\phi$ and writing

$$\sin\phi \approx \phi - \tfrac{1}{6}\phi^3.$$

If we use this approximation in the exact equation of motion (12.11), we get the
approximate equation

$$\ddot{\phi} + 2\beta\dot{\phi} + \omega_0^2\left(\phi - \tfrac{1}{6}\phi^3\right) = \gamma\omega_0^2 \cos\omega t. \tag{12.14}$$

To the extent that the new nonlinear term involving $\phi^3$ is small, we can anticipate that
the solution of this equation will still be reasonably approximated (once the transients
have died out) by an expression of the same form as before,

$$\phi(t) \approx A\cos(\omega t - \delta).$$

When this is put into (12.14), the small term involving $\phi^3$ contributes a term proportional
to $\cos^3(\omega t - \delta)$. Since

$$\cos^3 x = \tfrac{1}{4}(\cos 3x + 3\cos x) \tag{12.15}$$

(see Problem 12.5) there is now a small term on the left side of (12.14) proportional
to $\cos 3(\omega t - \delta)$. Since the right side contains no terms with this time dependence, it
follows that at least one of the terms on the left ($\phi$, $\dot{\phi}$, or $\ddot{\phi}$, and in fact all three) must.
That is, a more exact expression for $\phi(t)$ must have the form

$$\phi(t) = A\cos(\omega t - \delta) + B\cos 3(\omega t - \delta), \tag{12.16}$$

with $B$ much smaller than $A$. Therefore, we must anticipate that, as we increase the
driving force and the amplitude increases, the solution will pick up a small term that
oscillates with frequency $3\omega$.
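The step from the cubic term to a third harmonic can be made explicit. The following short derivation is a sketch (the full details are left to Problem 12.5); it uses only the double-angle formulas and then inserts the trial solution $\phi \approx A\cos(\omega t - \delta)$ into the cubic term of (12.14):

```latex
\begin{align*}
\cos 3x &= \cos 2x\,\cos x - \sin 2x\,\sin x \\
        &= (2\cos^2 x - 1)\cos x - 2(1 - \cos^2 x)\cos x
         = 4\cos^3 x - 3\cos x, \\
\cos^3 x &= \tfrac{1}{4}\left(\cos 3x + 3\cos x\right), \\
-\tfrac{1}{6}\,\omega_0^2\,\phi^3
        &\approx -\tfrac{1}{24}\,\omega_0^2 A^3
           \left[\cos 3(\omega t - \delta) + 3\cos(\omega t - \delta)\right].
\end{align*}
```

The second piece in the brackets merely modifies the response at the drive frequency, while the first is the source of the new term with frequency $3\omega$; its size grows as $A^3$, which is why it is negligible for small amplitudes.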

We can repeat this argument: If we substitute the improved solution (12.16) back
into (12.14), then the term in $\phi^3$ will give even smaller terms of the form $\cos n(\omega t - \delta)$,
with $n$ an integer greater than 3. Therefore we must expect smaller corrections to
(12.16) with frequencies $n\omega$, with $n$ equal to various integers. Any term oscillating
with frequency equal to an integer multiple of $\omega$ is called a harmonic of the drive
frequency. Thus our conclusion is that, as the drive strength is increased and the
nonlinearity becomes more important, the pendulum’s motion will pick up various
harmonics of the drive frequency $\omega$, the most important being the $n = 3$ harmonic
already included in (12.16).

Figure 12.3 (a) The motion of a DDP with drive strength $\gamma = 0.9$ (and all other
parameters the same as in Figure 12.2). After two or three drive cycles, the motion
settles down to a regular oscillation, which has period equal to the drive period and
looks at least approximately sinusoidal. (b) The solid curve is an enlargement of a
single cycle of part (a), from $t = 5$ to 6. The dashed curve, which is a pure cosine with
the same frequency, phase, and slope where they cross the axis, shows clearly that the
actual motion is no longer perfectly sinusoidal; it is appreciably flatter at the extremes.

The $n$th harmonic, with frequency $n\omega$, is periodic with period $\tau_n = 2\pi/n\omega = \tau/n$,
where $\tau = 2\pi/\omega$ is the drive period. Thus in one drive period, the $n$th harmonic repeats
itself $n$ times. In particular, in one drive period, every harmonic will have cycled back
to its original value, and a motion that is made up of various harmonics will still be
periodic with the same period as the driving force.

The main difference between the motion implied by (12.16) (possibly with other
harmonics included) and the motion (12.13) of the linear oscillator is that, with its extra
term (or terms), (12.16) is no longer given by a single cosine function. We should
be able to see this in a graph of $\phi(t)$ against $t$, which must deviate slightly from a
pure sinusoidal shape. However, in the regime we are considering, the coefficient $B$
in (12.16) and the coefficients of any higher harmonics are all much smaller than
$A$, and the difference between the actual motion and a pure cosine is quite hard
to see. Figure 12.3(a) shows the motion of a damped driven pendulum with drive
strength $\gamma = 0.9$ (just below our rough boundary at $\gamma = 1$ between weak and strong
drive strengths). Just as in Figure 12.2, the motion has quickly settled down to steady
oscillation with exactly the period of the driving force. At first glance the curve (after
about $t = 2$) appears to be a pure cosine, but on closer examination you may be able
to convince yourself that it is a little too flat at the crests and troughs. Figure 12.3(b)
is an enlargement of one cycle of the motion (solid curve), with a superposed pure
cosine with the same period and phase (dashed curve). This comparison shows clearly
that the actual motion is no longer a single pure cosine.$^7$


7 The flattened shape at the extremes is nicely consistent with (12.16): Provided $B$ and $A$ have
opposite signs, the second term in (12.16) reduces $\phi(t)$ at the crests and troughs and increases it
near where it crosses the axis. The behavior is also easy to understand physically: At the extremes,
the restoring torque of gravity ($mgL\sin\phi$) is weaker than the linear approximation ($mgL\phi$), and
the actual motion is less sharply curved.

The behavior of the DDP in the linear and nearly linear regimes can be quickly
summarized: As the initial transients die out, the motion approaches a unique attractor,
in which the pendulum oscillates with the same period as the driver. In the linear
regime (driving strength $\gamma \ll 1$) this limiting motion is given by a simple cosine
function with frequency equal to the drive frequency $\omega$. In the not-quite-linear regime
($\gamma$ somewhat larger, but definitely not much greater than 1), the limiting motion is
still periodic, with the same period, but it picks up some harmonics and is a sum of
cosines with frequencies $n\omega$ as in (12.16). As we shall see in the next section, we have
only to increase the drive strength a little above $\gamma = 1$, to encounter some dramatically
different behavior.


12.4 The DDP: Approach to Chaos 


Let us now continue increasing the strength of our DDP’s driver. Figure 12.4 shows
the motion ($\phi$ against $t$) for all the same parameters and initial conditions as in the
last two figures, except that I have increased the drive strength to $\gamma = 1.06$, just a
little above the rough boundary at $\gamma = 1$ between weak and strong driving. The most
striking thing about this plot is the dramatic oscillation of the initial transient motion.
In the first three drive cycles, the pendulum swings from $\phi = 0$ to nearly $5\pi$; that is,
it makes nearly two and a half counterclockwise rotations. In the next two cycles it
swings back nearly to $\phi = \pi$ and eventually settles down to more-or-less sinusoidal
oscillations around $\phi \approx 2\pi$. (The position $\phi = 2\pi$ is, of course, the same as $\phi = 0$, but
the statement that $\phi$ eventually centers on $2\pi$ is nonetheless meaningful, indicating
that the pendulum has made one net counterclockwise rotation since $t = 0$.)

Figure 12.4 The motion of a DDP with drive strength $\gamma = 1.06$. The
initial, rather wild, transients die out after about 9 drive cycles, and the
motion settles down to an attractor with the same period as the driver.

It is impossible to be completely sure, based on a graph such as Figure 12.4, that
the eventual motion really is exactly periodic. One way to examine this question
more closely is to print out the positions $\phi(t)$ at successive one-cycle intervals,
$t = t_0, t_0 + 1, t_0 + 2, t_0 + 3, \ldots$. The larger we choose $t_0$, the more closely these
positions should agree with one another (if the eventual motion is indeed periodic).
For example, starting at $t = 34$ the positions $\phi(t)$ (as given by the same numerical
solution on which Figure 12.4 was based) are

     t     $\phi(t)$
    34     6.0366
    35     6.0367
    36     6.0366
    37     6.0366
    38     6.0366
    39     6.0366

with all subsequent values equal to 6.0366. Evidently, to five significant figures, the
motion has settled down to be perfectly periodic after 35 drive cycles.$^8$ Of course,
it is possible that the motion does something nonperiodic in between the integer
times shown, and certainly no one would accept our data as mathematical proof.
Nevertheless, the evidence is overwhelming that for $\gamma = 1.06$ (and with the initial
conditions used) $\phi(t)$ does approach an attractor that is periodic with the same period
as the driver. In this respect, the motion shown for $\gamma = 1.06$ is not much different
from that for $\gamma = 0.9$ as shown in Figure 12.3. However, the dramatic initial swings
in Figure 12.4 are harbingers of interesting developments to come.

Period Two 

Figure 12.5(a) corresponds exactly to the previous figure except that I have now
increased the drive strength to $\gamma = 1.073$. Again the most obvious feature is the wild
initial oscillation, which now lasts for nearly 20 drive cycles before the motion settles
down to steady oscillations that are at least approximately sinusoidal. However, if you
look closely at these oscillations, you will notice that the crests and troughs (especially
the troughs) are not all of the same height. Figure 12.5(b) is a many-fold enlargement
of the same troughs between $t = 20$ and 30, and you can see clearly that the troughs
alternate between two distinct heights. You might wonder if this alternation is itself
a transient that will disappear after enough cycles, but this is not in fact so. A plot of
the oscillations for $990 < t < 1000$ looks exactly the same as that for $20 < t < 30$.
Another way to show this is to print out the numerical values of $\phi(t)$ at one-cycle
intervals. Starting at $t = 30$, this yields

     t     $\phi(t)$
    30     -6.6438
    31     -6.4090
    32     -6.6438
    33     -6.4090
    34     -6.6438
    35     -6.4090

a pattern that repeats precisely forever. Evidently, by $t = 30$, the motion has settled
down so that $\phi(t)$ has the value $-6.6438$ (to 5 significant figures) for all even values
of $t$ and has the distinct value $-6.4090$ for all odd values of $t$.

Figure 12.5 (a) The first 30 cycles of a DDP with drive strength $\gamma = 1.073$.
The wild initial oscillations persist for nearly 20 drive cycles, after which
the motion settles down to an attractor that is approximately sinusoidal.
However, closer inspection shows that the crests and troughs of this attractor
are not all of the same height. (b) An enlargement of the attractor for
$20 < t < 30$ showing just the troughs of part (a). The troughs alternate in
height, repeating themselves once every two drive cycles.


8 Naturally, the motion takes longer to settle down to a constant if we insist on more significant
figures. For example, it is not until 46 cycles have passed that $\phi(t)$ starts repeating to six significant
figures, after which $\phi(t) = 6.03662$ for $t = 46, 47, 48, \ldots$.

This behavior means that the motion no longer repeats itself with the frequency of
the driver. Rather, the motion is periodic with period equal to twice the drive period,
and we say that the motion has period two. (In our units this last statement is literally
true; in general it means that the period of the motion is two times the drive period.) It
is important to recognize that this development is quite different from the appearance
of the harmonics that we noticed in the case of nearly linear motion. A harmonic has
frequency $n\omega$, an integer multiple of the drive frequency, and hence period equal to
an integer submultiple of the drive period. What we have now found has period equal
to an integer multiple of the drive period, and hence frequency $\omega/n$, which can be
described as a subharmonic of the drive frequency. Looking at Figure 12.5(a) you can
see that the motion is still very nearly sinusoidal with the period of the driver (period
1). Thus the dominant term in $\phi(t)$ is still of the form $A\cos(\omega t - \delta)$; nevertheless,
$\phi(t)$ definitely contains a small subharmonic term with period 2.

Period Three 

Although the attractor shown in Figure 12.5 has period two, the dominant behavior
is still clearly of period one; that is, the new $n = 2$ subharmonic contributes only a
small amount to the solution. If we increase the drive strength a little further, we find an
attractor for which the subharmonic is the dominant term. Figure 12.6 shows the first
15 cycles of the motion of our DDP with the drive strength increased to $\gamma = 1.077$ (and
all other parameters the same as before). In this case it is obvious at just a glance that
the motion settles down to an attractor that repeats itself every three drive cycles and
hence has period three. While it would be hard to question that this graph has period
three, we can reinforce the conclusion by looking at the values of $\phi(t)$ at one-cycle
intervals. Starting from $t = 30$ these are as follows:

     t     $\phi(t)$
    30     13.81225
    31      7.75854
    32      6.87265
    33     13.81225
    34      7.75854
    35      6.87265
    36     13.81225
    37      7.75854
    38      6.87265

with exactly the same pattern, repeating once every three drive cycles, continuing
indefinitely. Evidently the solution has picked up a period-three term, and this term
dominates the solution.

Figure 12.6 The motion of a DDP with drive strength $\gamma = 1.077$. After
little more than two drive cycles, the motion has settled down to a periodic
attractor which repeats itself every three drive cycles (for example, the
troughs come just before $t = 5, 8, 11, 14$, and so on); therefore, the
attractor has period three.
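The test used in the three tables above (sample $\phi(t)$ once per drive cycle and look for the shortest repeat) is easy to automate. The sketch below is illustrative only: the helper `detect_period` and its tolerance are assumptions introduced here, and the input lists are simply the printed values from the tables for $\gamma = 1.06$ (from $t = 36$ on), $\gamma = 1.073$, and $\gamma = 1.077$.

```python
def detect_period(samples, max_period=16, tol=5e-5):
    """Return the smallest n with samples[i + n] == samples[i] (within tol) for all i.

    Requires at least two full periods of data; returns None if no period is found.
    """
    for n in range(1, max_period + 1):
        if len(samples) < 2 * n:
            break
        if all(abs(samples[i + n] - samples[i]) <= tol
               for i in range(len(samples) - n)):
            return n
    return None


# Values of phi at one-cycle intervals, copied from the tables in this section.
period_one = [6.0366, 6.0366, 6.0366, 6.0366]                       # gamma = 1.06
period_two = [-6.6438, -6.4090, -6.6438, -6.4090, -6.6438, -6.4090]  # gamma = 1.073
period_three = [13.81225, 7.75854, 6.87265,
                13.81225, 7.75854, 6.87265,
                13.81225, 7.75854, 6.87265]                          # gamma = 1.077

for name, data in [("1.06", period_one), ("1.073", period_two), ("1.077", period_three)]:
    print(f"gamma = {name}: period {detect_period(data)}")
```

The tolerance of half a unit in the last printed digit is a choice made for these four-decimal tables; as the text notes, agreement at the sampled times is strong evidence of periodicity but not a proof.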


More than One Attractor 

For a linear oscillator, with a given set of parameters, we proved in Section 5.5 that
there is a unique attractor; that is, whatever the initial values of $\phi$ and $\dot{\phi}$, the eventual
motion will always be the same, once the transients have died out. For a nonlinear
oscillator, this is not the case, and the DDP with the drive strength $\gamma = 1.077$ of Figure
12.6 furnishes a clear example. In Figure 12.7, I have shown the motion for a DDP
with the same parameters as in Figure 12.6 (including the same drive strength), but
with two different sets of initial conditions. The dashed curve is the same solution as
in Figure 12.6, with the same initial conditions as we have used for every graph up to
now, $\phi(0) = \dot{\phi}(0) = 0$. The solid curve shows the motion of the exact same DDP, also
with $\dot{\phi}(0) = 0$, but with $\phi(0) = -\pi/2$; that is, for the solid curve, the pendulum was
released from $90^\circ$ on the left. As you can clearly see, the two attractors (the curves to
which the actual motions converge as the initial transients die out) are totally different.
For the dashed curve, the attractor has period three; for the solid curve the eventual
period is (as you can see if you look closely) actually two, with alternate troughs (and
alternate crests) having slightly different heights. Evidently, for a nonlinear oscillator,
different initial conditions can lead to totally different attractors.

Figure 12.7 Two solutions for the same DDP, with the same drive strengths,
but different initial conditions [$\phi(0) = \dot{\phi}(0) = 0$ for the dashed curve, but
$\phi(0) = -\pi/2$ and $\dot{\phi}(0) = 0$ for the solid curve]. Even after the transients
have died out, the two motions are totally different.


A Period-Doubling Cascade 

Having recognized that different initial conditions can lead to different attractors, we
must anticipate that the evolution of the oscillations as we vary $\gamma$ may depend on
the initial conditions that we choose. In the sequence of Figures 12.2 through 12.6,
I used the initial conditions $\phi(0) = 0$ and $\dot{\phi}(0) = 0$ for all five pictures. It turns out
that the new initial conditions $\phi(0) = -\pi/2$ and $\dot{\phi}(0) = 0$, introduced in Figure 12.7,
lead to a quite different and very interesting evolution. In Figure 12.8, I have shown
the motion of the DDP for four successively larger values of $\gamma$, all with these new
initial conditions. The left-hand pictures show $\phi(t)$ as a function of $t$ for the first ten
cycles of the driver. The first graph is for $\gamma = 1.06$, the same value as was used in
Figure 12.4, and, as in Figure 12.4, the motion settles down to a steady oscillation
with period equal to the drive period; that is, the attractor has period one. To confirm
this conclusion, the right-hand picture shows the same motion, but for $28 < t < 40$
(by which time any initial transients have completely disappeared at the scale shown),
with the vertical scale magnified to show clearly that successive oscillations are all of
equal amplitude.

For the second pair of graphs, the drive strength was increased to $\gamma = 1.078$. At first
glance, the motion looks very similar to that for $\gamma = 1.06$, but on closer inspection you
can see that the maxima and minima are not all of the same height. This is very visible
in the enlargement on the right where you can see easily that the minima alternate
between two distinct, fixed heights, so that the attractor now has period two.

With $\gamma = 1.081$, as in the third pair, the graph on the left again looks pretty much
like its two predecessors, and it is hard to be sure just what is going on. One of the
reasons is that we can’t be sure that ten drive cycles (the number shown on the left) are
long enough for all transients to have disappeared, but in the right-hand enlargement
it is quite clear that the minima are alternating among four different values. That is,
the period has doubled again to period four.

In the last pair of pictures, with $\gamma = 1.0826$, it is even harder to be sure what
is happening in the left-hand picture, but the enlargement on the right makes clear
that the motion eventually repeats once every eight drive cycles. That is, the attractor
has period eight. The period-doubling cascade seen in these four pairs of pictures
continues. If we increase the drive strength further, we find motion with period 16,
then 32, and so on to infinity.

Figure 12.8 A period-doubling cascade. The left-hand pictures show the first
ten drive cycles of a DDP with successively larger drive strengths, as indicated
on the left. All other parameters, including the initial conditions $\phi(0) = -\pi/2$
and $\dot{\phi}(0) = 0$, are the same in all pictures. In each picture on the right I have
enlarged the bottom of the corresponding motion on the left, to show more clearly
the differences in extent of successive oscillations; these enlargements show 12
drive cycles, starting from $t = 28$, by which time the motion has settled down to
a perfectly periodic attractor (at least at the scale shown). Each double-headed
arrow shows one complete cycle of the corresponding motion; the periods of the
four attractors are clearly seen to be 1, 2, 4, and 8, as indicated.

Figure 12.9 A period-doubling cascade in convection of mercury in
a small convection cell. The plots show the temperature at one fixed
point in the cell as a function of time, for four successively larger
temperature gradients as given by the parameter $R/R_c$.

The period-doubling cascade of Figure 12.8 is a very striking phenomenon, but
the quantitative differences between the four successive unenlarged graphs are quite
small. You might guess that to build a driven damped pendulum sufficiently precise to
observe these subtle differences would be very hard, and this is indeed the case. Nevertheless,
such pendulums have been constructed and have been used to observe all
of the effects described in this chapter, with amazing agreement between theory and
experiment.$^9$ Perhaps even more remarkable is that the phenomenon of period doubling
is found in many completely different nonlinear systems: electrical circuits,
chemical reactions, balls bouncing on oscillating surfaces, and many more. In each of
these systems, there is a “control parameter” that can be varied (the driving strength
of a DDP, a voltage in an electrical circuit, a flow rate in a chemical reaction). The
behavior of the system is monitored as this parameter is varied, and it is found that the
behavior exhibits period-doubling cascades. Figure 12.9 shows a cascade observed
by Libchaber et al.$^{10}$ in the convection of mercury in a small box whose bottom is
maintained at a slightly higher temperature than its top. This temperature difference
is the control parameter and is measured by a number $R$, called the Rayleigh number.
When $R$ is very small, heat is conducted up with no convection. Then at a critical
temperature difference, $R_c$, steady convection sets in, and, as $R$ is increased still further,
the convection becomes oscillatory. These oscillations can be observed by measuring
the temperature at any fixed point in the cell, and Figure 12.9 shows four plots of the
observed temperature (at one fixed point) against time, for four successively larger
values of the control parameter $R$. The period doublings from 1 to 2, from 2 to 4, and
from 4 to 8 are beautifully clear.

Not only are period-doubling cascades observed in numerous different systems;
in a sense that I shall describe directly, the cascades also occur in the same way, a
circumstance referred to as “universality.”


9 For a description of three of the commercially available “chaotic pendulums” see J. A. Blackburn
and G. L. Baker, “A Comparison of Commercial Chaotic Pendulums,” American Journal of
Physics, Vol. 66, p. 821, (1998). The Daedalon pendulum described there was used to get the data
shown in Figure 12.32.


The Feigenbaum Number and Universality 

Returning to the period-doublings of the DDP, you can see from the values of the
drive strength $\gamma$ shown in Figure 12.8 that the doublings occur faster and faster as we
increase $\gamma$. To make this idea quantitative, we need to examine the threshold values
of $\gamma$ at which the period actually doubles. For example, looking at the numbers in
Figure 12.8, it seems clear that somewhere between $\gamma = 1.06$ and 1.078, there must
be a value $\gamma_1$ where the period changes from 1 to 2. Finding where this threshold (or
“bifurcation point”) actually occurs is surprisingly hard, but it turns out that (to 5
significant figures) $\gamma_1 = 1.0663$. Similarly, at $\gamma_2 = 1.0793$, the period changes from 2
to 4. If we let $\gamma_n$ denote the threshold at which the period changes from $2^{n-1}$ to $2^n$, then
the first few thresholds $\gamma_n$ are as shown in Table 12.1. In the last column of the table,
I have shown the distances $\gamma_{n+1} - \gamma_n$ between successive thresholds, which, as you
can see, shrink geometrically,$^{11}$ each interval being about one fifth of its predecessor.

Table 12.1 The first four thresholds $\gamma_n$ at which the period of the
DDP [with the initial conditions $\phi(0) = -\pi/2$ and $\dot{\phi}(0) = 0$] doubles
from 1 to 2, 2 to 4, 4 to 8, and 8 to 16. The last column shows the
widths of the intervals between successive thresholds.

    n     period      $\gamma_n$      interval
    1     1 → 2       1.0663          0.0130
    2     2 → 4       1.0793          0.0028
    3     4 → 8       1.0821          0.0006
    4     8 → 16      1.0827

In the late 1970’s, the physicist Mitchell Feigenbaum (born 1944) showed not only
that many different nonlinear systems undergo similar period-doubling cascades but
that the cascades all show the same geometric acceleration; specifically, the intervals
between the thresholds for the control parameter (the drive strength, in our case) satisfy

$$(\gamma_{n+1} - \gamma_n) \approx \frac{1}{\delta}(\gamma_n - \gamma_{n-1}), \tag{12.17}$$

where the constant $\delta$ has the same value

$$\delta = 4.6692016 \tag{12.18}$$

for all such systems and is called the Feigenbaum number.$^{12}$ It is the widespread
occurrence of period doubling and the fact that $\delta$ has the same value for so many different
systems that has led to the phenomenon of period doubling being characterized
as universal. I have written the Feigenbaum relation (12.17) with an “approximately
equal” sign, because, strictly speaking, the relation holds only in the limit that $n \to \infty$.
For many systems, however, the relation is a very good approximation for all values
of $n$. (See Problems 12.11 and 12.29.)


10 Reproduced with permission from A. Libchaber, C. Laroche, and S. Fauve, Journal de
Physique-Lettres, vol. 43, p. 211 (1982).

11 A sequence of numbers, $a_1, a_2, \ldots$, is geometric if $a_{n+1} = ka_n$ for some fixed number $k$. If
$|k| < 1$, the geometric sequence goes to zero as $n \to \infty$.

The Feigenbaum relation (12.17) implies that the intervals between successive
thresholds approach zero rapidly, and hence that the thresholds themselves approach
a finite limit $\gamma_c$,

$$\gamma_n \to \gamma_c \quad (\text{as } n \to \infty).$$

Therefore, the sequence of thresholds $\gamma_n$ satisfies

$$\gamma_1 < \gamma_2 < \cdots < \gamma_n < \cdots < \gamma_c \tag{12.19}$$

with infinitely many thresholds squeezed in the rapidly narrowing gap between $\gamma_n$ and
$\gamma_c$. For our DDP, the limit $\gamma_c$ is found to be

$$\gamma_c = 1.0829. \tag{12.20}$$

We shall see that beyond the critical value $\gamma_c$, chaos sets in, so the period-doubling
cascade is called a route to chaos. However, I should emphasize that there are systems
that exhibit chaos without first going through a period-doubling cascade; that is, the
period-doubling cascade is just one of several possible routes to chaos.
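The numbers in Table 12.1 can be checked against the Feigenbaum relation (12.17) with a few lines of arithmetic. The sketch below is a rough illustration using only the rounded table entries; the extrapolation to $\gamma_c$ is an assumption introduced here, namely that every interval beyond the last tabulated one shrinks by exactly the factor $1/\delta$, so that the remaining intervals sum as a geometric series to (last interval)$/(\delta - 1)$.

```python
FEIGENBAUM_DELTA = 4.6692016                   # the constant of (12.18)
thresholds = [1.0663, 1.0793, 1.0821, 1.0827]  # gamma_1 ... gamma_4 from Table 12.1

# Widths gamma_{n+1} - gamma_n of the intervals between successive thresholds.
intervals = [b - a for a, b in zip(thresholds, thresholds[1:])]

# Ratios of successive intervals; (12.17) says these should approach delta.
ratios = [intervals[i] / intervals[i + 1] for i in range(len(intervals) - 1)]

# Sum of the remaining intervals, assuming each is 1/delta times its predecessor.
gamma_c_estimate = thresholds[-1] + intervals[-1] / (FEIGENBAUM_DELTA - 1.0)

print("intervals:", [round(w, 4) for w in intervals])
print("ratios:   ", [round(r, 2) for r in ratios])
print("estimated gamma_c:", round(gamma_c_estimate, 4))
```

Because the thresholds are quoted to only four decimal places, the ratios are crude (the last interval has a single significant figure), but they are close to $\delta$, and the extrapolated limit agrees with (12.20) to the precision quoted.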


12 Actually there are two Feigenbaum numbers, and this one is often called Feigenbaum’s delta.


12.5 Chaos and Sensitivity to Initial Conditions


If we increase the drive strength $\gamma$ beyond the critical value $\gamma_c = 1.0829$, then our DDP
begins to exhibit the behavior that has come to be called “chaos.” Figure 12.10 shows
the first thirty drive cycles of the DDP with $\gamma = 1.105$. The pendulum is obviously
“trying” to oscillate with the period of the driver. Nevertheless, the actual oscillations
wander around erratically and never repeat themselves exactly. Of course you might
wonder if I have not given the oscillations time to settle down; perhaps at some later
time they would converge to a periodic motion. In fact, however, a graph of any time
interval is just as erratic, but never an exact repetition of any other interval. Even
though the driving force is perfectly periodic and even after the transients have all
died out, the long-term motion is definitely nonperiodic. This erratic, nonperiodic
long-term behavior is one of the defining characteristics of chaos. The other defining
characteristic is the phenomenon called sensitivity to initial conditions.

Sensitivity to Initial Conditions 

The issue of sensitivity to initial conditions arises in connection with the following
questions: Imagine two identical DDP’s, with all parameters the same, but launched at
$t = 0$ with slightly different initial conditions. [Perhaps the initial angles $\phi(0)$ differ by
a fraction of a degree.] As time goes by, do the motions of the two pendulums remain
nearly the same? Do they perhaps get closer to one another? Or do they diverge and
become more and more different?

To make these questions more precise, let us denote the positions of the two
pendulums by $\phi_1(t)$ and $\phi_2(t)$. These two functions satisfy exactly the same equation
of motion, but have slightly different initial conditions. If now $\Delta\phi(t)$ denotes the
difference between our two solutions,

$$\Delta\phi(t) = \phi_2(t) - \phi_1(t), \tag{12.21}$$

the issue is the time dependence of $\Delta\phi(t)$. Does $\Delta\phi(t)$ stay more-or-less constant?
Does it decrease as time goes by? Or does it increase?

For the linear oscillator discussed in Chapter 5, the answer is that A <p(t) goes 
to zero, since we proved that all solutions of the equation of motion approach the 



Figure12.10 Chaos. The first 30 drive cycles ofthe DDP withy = 1.105 
are erratic and show no signs of periodicity. In fact the oscillations never 
do settle down to a regular periodic motion, and this erratic, nonperiodic 
long-term motion is one of the defining characteristics of chaos. 
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same attractor as t oo. Therefore the difference of any two solutions must approach 
zero. Further, the difference must approach zero exponentially. To see this, recall from 
Equation (5.67) that any solution has the form 

$$\phi(t) = A\cos(\omega t - \delta) + C_1 e^{r_1 t} + C_2 e^{r_2 t} \tag{12.22}$$

where the cosine term is the same for all solutions, whereas the two decaying
exponential terms have coefficients $C_1$ and $C_2$ that depend on the initial conditions. This
implies that when we take the difference of two solutions the cosine term drops out
and we are left with

$$\Delta\phi(t) = B_1 e^{r_1 t} + B_2 e^{r_2 t}, \tag{12.23}$$

where the constants $B_1$ and $B_2$ depend on the two sets of initial conditions. The precise
behavior of this difference depends on the relative sizes of the damping constant $\beta$
and the natural frequency $\omega_0$. In all of the examples so far in this chapter I have chosen
$\beta = 0.25\,\omega_0$, so that $\beta < \omega_0$ (the situation called underdamping). In this case, we saw
in Section 5.4 that the coefficients $r_1$ and $r_2$ have the form $-\beta \pm i\omega_1$. Some simple
algebra then puts (12.23) in the form [compare Equation (5.38)]

$$\Delta\phi(t) = D e^{-\beta t} \cos(\omega_1 t - \delta). \tag{12.24}$$

That is, $\Delta\phi(t)$ is the exponential $e^{-\beta t}$ times an oscillatory cosine.

There is a problem in trying to display the time dependence of a function like
(12.24). The exponential factor decays so fast that one cannot easily show its range
of values on a conventional graph. For example, with the values I have been using,
$\beta = 0.25\,\omega_0 = 0.75\pi = 2.356$, after just one drive cycle ($t = 1$) the exponential factor
is $e^{-\beta t} = e^{-2.356} \approx 0.09$, and $\Delta\phi(t)$ has diminished by an order of magnitude. If we
wanted to plot $\Delta\phi(t)$ against $t$ over 10 cycles, say, then $\Delta\phi(t)$ would shrink by about
ten orders of magnitude, a range that cannot possibly be shown on a simple linear
plot of $\Delta\phi(t)$ against $t$.
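
The arithmetic in this paragraph is easy to check directly. The short sketch below uses only the values quoted in the text (drive period 1, so $\omega = 2\pi$, and $\beta = 0.25\,\omega_0 = 0.75\pi$, which implies $\omega_0 = 3\pi$); the function names are illustrative and not part of the book.

```python
import math

# Values quoted in the text: beta = 0.25 * omega_0 = 0.75 * pi, hence omega_0 = 3 * pi.
OMEGA_0 = 3 * math.pi
BETA = 0.25 * OMEGA_0


def decay_factor(t, beta=BETA):
    """Envelope factor exp(-beta * t) after t drive cycles."""
    return math.exp(-beta * t)


def orders_of_magnitude(t, beta=BETA):
    """Number of powers of ten by which the envelope has shrunk after time t."""
    return beta * t / math.log(10)


if __name__ == "__main__":
    for t in (1, 10):
        print(f"t = {t:2d}: factor = {decay_factor(t):.3e}, "
              f"orders of magnitude = {orders_of_magnitude(t):.2f}")
```

After one cycle the factor is a little under 0.1, and after ten cycles the envelope has dropped by roughly ten powers of ten, as stated.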

As you probably know, the solution to this problem is to make a logarithmic plot;
that is, we plot the log of $\Delta\phi(t)$ against $t$. Actually since $\Delta\phi(t)$ can be negative, we
must plot $\ln|\Delta\phi(t)|$ against $t$. According to (12.24) this should obey

$$\ln|\Delta\phi(t)| = \ln D - \beta t + \ln|\cos(\omega_1 t - \delta)|. \tag{12.25}$$

The first term on the right is a constant, and the second is linear in $t$ with slope $-\beta$.
The third is a little complicated: Since $|\cos(\omega_1 t - \delta)|$ oscillates between 1 and 0, its
natural log oscillates between 0 and $-\infty$. Thus a graph of $\ln|\Delta\phi(t)|$ against $t$ should
bounce up and down (going to $-\infty$ each time the cosine term vanishes), underneath
an envelope that decreases linearly with slope $-\beta$. This is clearly visible in Figure
12.11, which shows a plot of $\log|\Delta\phi(t)|$ against $t$ for the relatively weak driving
strength $\gamma = 0.1$, for which the linear approximation is certainly good. [I plotted the
log to base 10 rather than the natural log, because the former is easier to interpret on
a graph. Since $\log(x)$ is just the constant $1/\ln(10)$ times $\ln(x)$, this changes none of our
qualitative predictions.] To plot this graph, I gave the first pendulum the same initial
conditions as in Figures 12.8 and 12.10; the second pendulum was released with its
initial position 0.1 radians lower, so that the initial difference was $\Delta\phi(0) = 0.1$ rad, or
about 6 degrees. The most important feature of the plot is that the successive maxima of
$\log|\Delta\phi(t)|$ decrease perfectly linearly, confirming that $\Delta\phi(t)$ decays exponentially,
dropping by about 10 orders of magnitude in the first ten drive cycles (as you can
easily check from the graph).¹³

Figure 12.11 Logarithmic plot of $\Delta\phi(t)$, the separation of two identical
DDP’s, with a weak drive strength $\gamma = 0.1$, that were released with
initial positions that differ by 0.1 radians (or about 6°). The vertical
axis on the left shows $\log|\Delta\phi(t)|$, while that on the right shows $|\Delta\phi(t)|$
itself. The picture shows clearly that the maxima of $\log|\Delta\phi(t)|$ decrease
linearly, and hence that $\Delta\phi(t)$ decays exponentially.

So far we have proved that, in the linear regime, the separation $\Delta\phi(t)$ of two
identical DDP’s, launched with different initial conditions, decreases exponentially.
This has an important practical consequence: In practice, we cannot possibly know the
initial conditions of any system exactly. Therefore, when we try to predict the future
behavior of our DDP we must recognize that the initial conditions we use may differ
a little from the true initial conditions. This means that our predicted motion for $t > 0$
may differ from the true motion. But because $\Delta\phi(t)$ goes to zero exponentially, we
can be sure that our error will never be worse than the initial error and will, in fact,
rapidly approach zero. We can say that the linear oscillator is insensitive to its initial
conditions. To achieve any prescribed accuracy in our predictions, we have only to
ascertain the initial condition to this same accuracy.

¹³ A second noteworthy feature is that the points where $|\Delta\phi(t)|$ vanishes (and, hence, $\log|\Delta\phi(t)|$
goes to $-\infty$) show up on the logarithmic plot as sharp downward spikes. This is because the
plotting program can only sample a finite number of points and naturally misses the points where
$\log|\Delta\phi(t)| = -\infty$. Instead it can only detect that there are points where $\log|\Delta\phi(t)|$ has a
precipitous minimum, and this is what it shows.

What happens as we increase the drive strength $\gamma$ out of the linear regime?
Naturally, we can no longer depend on our proofs for the linear oscillator. However, we
can reasonably expect that the difference $\Delta\phi(t)$ will continue to decay exponentially
for at least some range of drive strengths. The question is “How large is this range?”
and the answer is quite surprising: Provided the difference in initial conditions is
sufficiently small, the difference $\Delta\phi(t)$ continues to decay exponentially for all values
of $\gamma$ up to the critical value $\gamma_c$ at which chaos sets in. For example, if $\gamma = 1.07$ we
know that the motion has period 2 (for the initial conditions of Figures 12.8 through
12.11), and the motion is distinctly nonlinear; nonetheless, the difference $\Delta\phi(t)$ still
decays exponentially, as is clearly visible in Figure 12.12, which shows the difference
$\Delta\phi(t)$ for two solutions with the same initial conditions as in Figure 12.11. In this case,
$\Delta\phi(t)$ remains pretty well constant in amplitude for the first 15 or 20 drive cycles,
but then the crests of $\log|\Delta\phi(t)|$ drop perfectly linearly, indicating that $\Delta\phi(t)$ decays
exponentially as $t \to \infty$. Notice, however, that the exponential decay is considerably
slower than in the linear case: Here the amplitude drops by about 4 orders of magnitude
in the last 25 cycles; in the linear case of Figure 12.11, it dropped by 10 orders of
magnitude in just 10 cycles. Nevertheless, the main point is that $\Delta\phi(t)$ goes to zero
exponentially, and, as in the linear regime, we can predict the future behavior of our
DDP, confident that any uncertainties in our predictions will be not much larger (and
usually much smaller) than our uncertainty in the initial conditions.

Figure 12.12 Logarithmic plot of $\Delta\phi(t)$, the separation of two identical
DDP’s, with drive strength $\gamma = 1.07$, that were released with initial
positions that differ by 0.1 radians (or about 6°). For the first 15 or so drive cycles,
$\Delta\phi(t)$ holds fairly constant in amplitude, but then the maxima of $\log|\Delta\phi(t)|$
decrease linearly, implying that $\Delta\phi(t)$ decays exponentially.

If we now increase the drive strength past $\gamma_c = 1.0829$ into the chaotic regime, the
picture changes completely. Figure 12.13 shows $\Delta\phi(t)$ for the same DDP as in Figures
12.11 and 12.12, except that the drive strength is now $\gamma = 1.105$, the same value used
in our first plot of chaotic motion in Figure 12.10. The most obvious feature of this
graph is that $\Delta\phi(t)$ clearly grows with time. In fact, you will notice that to highlight
the growth of $\Delta\phi(t)$ I chose the initial difference to be just $\Delta\phi(0) = 0.0001$ radians.
Starting from this tiny value, $|\Delta\phi|$ has increased in 16 drive cycles by more than 4
orders of magnitude to about $|\Delta\phi| \approx 3.5$.

From $t = 1$ through $t = 16$ (where the graph is leveling off), the maxima in Figure
12.13 grow almost perfectly linearly, implying that $\Delta\phi(t)$ grows exponentially.¹⁴ This
exponential growth spells disaster for any attempt at accurate prediction of the DDP’s
long-term motion. For the present case, an error as small as $10^{-4}$ radians in our initial
conditions will have grown in 16 cycles to an error of about 3.5, or more than $\pi$ radians.
Thus an uncertainty of $\pm 10^{-4}$ radians in the initial conditions grows to an uncertainty
of $\pm\pi$, and an uncertainty of $\pm\pi$ in the angle of a pendulum means that we have
no idea at all where the pendulum is! I chose this example because it is especially
dramatic. Nevertheless, in any chaotic motion, $\Delta\phi(t)$ grows exponentially for a while
at least. Even if this growth levels out before $\Delta\phi(t)$ reaches $\pi$, the exponential
growth means that a tiny uncertainty in the initial conditions quickly grows into a
large uncertainty in the predicted motion. It is in this sense that we say chaos exhibits
extreme sensitivity to initial conditions, and this sensitivity is what can make the
reliable prediction of chaotic motion a practical impossibility.

Figure 12.13 The separation $\Delta\phi(t)$ of two identical pendulums, both with drive
strength $\gamma = 1.105$ and with an initial separation $\Delta\phi(0) = 10^{-4}$ rad. After a small
initial drop, the crests of $\log|\Delta\phi(t)|$ increase linearly, showing that $\Delta\phi(t)$ itself
grows exponentially.

¹⁴ The eventual leveling of the curve is easily understood. You can see from Figure 12.10 that
the angle $\phi(t)$ [actually, $\phi_1(t)$, but the same applies to $\phi_2(t)$] oscillates between about $\pm\pi$. That
is, neither $\phi_1(t)$ nor $\phi_2(t)$ ever exceeds magnitude $\pi$. Therefore their difference $\Delta\phi(t)$ can never
exceed $2\pi$. Thus the curve has to level off before $\Delta\phi(t)$ reaches $2\pi$.
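
The contrast between the linear and chaotic regimes can be explored numerically. The following is a minimal sketch, not the program used for the figures: it assumes that the equation of motion (12.11) has the form $\ddot\phi + 2\beta\dot\phi + \omega_0^2 \sin\phi = \gamma\omega_0^2\cos\omega t$ with $\omega = 2\pi$, $\omega_0 = 1.5\,\omega$ and $\beta = \omega_0/4$, it takes the first pendulum to start at $\phi(0) = -\pi/2$, $\dot\phi(0) = 0$, and it uses a fixed-step fourth-order Runge-Kutta integrator chosen here only for simplicity. All function names are illustrative.

```python
import math

OMEGA = 2 * math.pi      # drive frequency (drive period = 1)
OMEGA_0 = 1.5 * OMEGA    # natural frequency, so that beta = 0.75 * pi as in the text
BETA = 0.25 * OMEGA_0    # damping constant


def accel(t, phi, phidot, gamma):
    """Angular acceleration from the assumed form of the DDP equation (12.11)."""
    return (-2 * BETA * phidot - OMEGA_0 ** 2 * math.sin(phi)
            + gamma * OMEGA_0 ** 2 * math.cos(OMEGA * t))


def rk4_step(t, phi, phidot, dt, gamma):
    """One classical fourth-order Runge-Kutta step for the pair (phi, phidot)."""
    k1p, k1v = phidot, accel(t, phi, phidot, gamma)
    k2p = phidot + 0.5 * dt * k1v
    k2v = accel(t + 0.5 * dt, phi + 0.5 * dt * k1p, phidot + 0.5 * dt * k1v, gamma)
    k3p = phidot + 0.5 * dt * k2v
    k3v = accel(t + 0.5 * dt, phi + 0.5 * dt * k2p, phidot + 0.5 * dt * k2v, gamma)
    k4p = phidot + dt * k3v
    k4v = accel(t + dt, phi + dt * k3p, phidot + dt * k3v, gamma)
    phi += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    phidot += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return phi, phidot


def separation(gamma, dphi0, cycles, steps_per_cycle=1000):
    """Return (times, |Delta phi|) for two pendulums whose initial angles differ by dphi0."""
    dt = 1.0 / steps_per_cycle
    a = (-math.pi / 2, 0.0)            # first pendulum
    b = (-math.pi / 2 + dphi0, 0.0)    # second pendulum, so that Delta phi(0) = dphi0
    ts, seps = [0.0], [abs(dphi0)]
    for n in range(cycles * steps_per_cycle):
        t = n * dt
        a = rk4_step(t, a[0], a[1], dt, gamma)
        b = rk4_step(t, b[0], b[1], dt, gamma)
        ts.append((n + 1) * dt)
        seps.append(abs(b[0] - a[0]))
    return ts, seps


if __name__ == "__main__":
    for gamma, dphi0, cycles in ((0.1, 0.1, 10), (1.105, 1e-4, 16)):
        ts, seps = separation(gamma, dphi0, cycles)
        print(f"gamma = {gamma}: largest |dphi| in final cycle = {max(seps[-1000:]):.3e}")
```

Plotting the base-10 logarithm of the returned separations against the returned times should give pictures of the same kind as Figures 12.11 and 12.13, although the fine details depend on the step size.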


The Liapunov Exponent 

What we have seen in the preceding three examples can be rephrased to say that the
difference $\Delta\phi(t)$ between two identical DDP’s released with slightly different initial
conditions behaves exponentially:

$$\Delta\phi(t) \sim K e^{\lambda t} \tag{12.26}$$

(where the symbol “$\sim$” signifies that $\Delta\phi(t)$ may oscillate underneath an envelope
with the advertised behavior, and $K$ is a positive constant). The coefficient $\lambda$ in the
exponent is called the Liapunov exponent.¹⁵ If the long-term motion is nonchaotic
(settles down to periodic oscillation) the Liapunov exponent is negative; if the
long-term motion is chaotic (erratic and nonperiodic) the Liapunov exponent is positive.
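
One simple way to estimate $\lambda$ from a computed $\Delta\phi(t)$ is to fit a straight line through the crests of $\ln|\Delta\phi(t)|$; the slope of that line is the estimate. The sketch below is only an illustration of this idea, not a method taken from the text. It is tested on synthetic data of the form (12.24), for which the answer must be $\lambda = -\beta$; the amplitude 0.1 and phase 0.3 are arbitrary choices, $\omega_1 = \sqrt{\omega_0^2 - \beta^2}$ is the usual underdamped frequency, and the function names are hypothetical.

```python
import math


def local_maxima(ts, values):
    """Return (t, v) pairs at interior samples that exceed both neighbours."""
    return [(ts[i], values[i]) for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


def fit_liapunov(ts, seps):
    """Least-squares slope of ln|Delta phi| through the crests of |Delta phi|."""
    crests = local_maxima(ts, [abs(s) for s in seps])
    points = [(t, math.log(v)) for t, v in crests if v > 0]
    n = len(points)
    t_mean = sum(t for t, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in points)
    den = sum((t - t_mean) ** 2 for t, _ in points)
    return num / den


# Synthetic separation of the form (12.24) with the parameter values used in the text.
BETA = 0.75 * math.pi
OMEGA_1 = math.sqrt((3 * math.pi) ** 2 - BETA ** 2)
TS = [0.001 * i for i in range(10001)]          # ten drive cycles
SEPS = [0.1 * math.exp(-BETA * t) * math.cos(OMEGA_1 * t - 0.3) for t in TS]

if __name__ == "__main__":
    print(f"fitted exponent = {fit_liapunov(TS, SEPS):.4f}, -beta = {-BETA:.4f}")
```

Because $t$ is measured in drive periods, the fitted slope is an exponent per drive cycle. For a chaotic run the fit should be restricted to the interval before the curve levels off.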


¹⁵ Strictly speaking there are several Liapunov exponents, of which the one discussed here is the
largest.


Higher Values of $\gamma$

So far we have seen that as we increased the drive strength $\gamma$, the motion of our DDP
became more and more complicated: from the linear regime, with its pure sinusoidal
response, to the nearly linear regime, with the addition of harmonics, to the appearance
of subharmonics and (for certain initial conditions at least) a period-doubling cascade,
and finally to chaos. You might naturally anticipate that, if we were to increase $\gamma$ still
further, the chaos would continue and intensify, but as usual our nonlinear system
defies predictions. As $\gamma$ increases, the DDP actually alternates between intervals of
chaos separated by intervals of periodic, nonchaotic motion. I shall illustrate this with
just two examples.

We saw that with $\gamma = 1.105$, the DDP exhibits chaotic behavior (Figure 12.10)
and exponential divergence of neighboring solutions (Figure 12.13). We have only
to increase the drive strength to $\gamma = 1.13$ to enter a narrow “window” of nonchaotic
period-3 oscillation with exponential convergence of neighboring solutions, as shown
in Figure 12.14. In part (a), you can see that within three drive cycles the motion
settles into regular period-3 oscillations. Part (b) shows the separation $\Delta\phi(t)$ of two
pendulums released with an initial difference $\Delta\phi(0)$ of 0.001 radians; in the first eight
drive cycles, $\Delta\phi(t)$ actually increases, but from then on it decreases exponentially to
zero, dropping by six orders of magnitude in the next twelve cycles.

[Figure panels, $\gamma = 1.13$: (a) $\phi(t)$ vs $t$; (b) $\log|\Delta\phi(t)|$ vs $t$]

Figure 12.14 Motion of a DDP with $\gamma = 1.13$. (a) The graph of $\phi(t)$
quickly settles down to oscillations of period 3 (same initial conditions as
in Figures 12.8 and 12.10). (b) Logarithmic plot of the distance between
two identical pendulums, the first with the same initial conditions as in
part (a), the second with its initial angle 0.001 radians lower. After an
initial modest increase, $\Delta\phi(t)$ goes to 0 exponentially.


Figure 12.15 shows the corresponding two graphs for a drive strength of $\gamma = 1.503$,
where the motion has returned to being chaotic.¹⁶ In part (a) we see a new kind
of chaotic motion. The driving force is now strong enough to keep the pendulum
rolling right over the top, and in the first 18 drive cycles the pendulum makes 13
complete clockwise rotations [$\phi(t)$ decreases by $26\pi$]. The motion here can be
seen as a steady rotation at about one revolution per drive cycle, with an erratic
oscillation superposed.¹⁷ At $t = 18$, the motion reverses to a more-or-less steady
counterclockwise rotation with erratic oscillations superposed, and, as the picture
suggests, the motion never settles down to be periodic.

Figure 12.15 Motion of a DDP with $\gamma = 1.503$. (Same initial conditions
as in Figure 12.14.) (a) The graph of $\phi(t)$ against $t$ oscillates erratically.
In the first 18 cycles, it plunges to about $-26\pi$, that is, it makes about
13 complete clockwise rotations. It then starts to climb back but never
actually repeats itself. (b) Logarithmic plot of the distance between two
identical pendulums with an initial separation of 0.001 radians. For the
first nine or ten cycles, $\Delta\phi(t)$ grows exponentially and then levels out.

¹⁶ As we shall see, between the values $\gamma = 1.13$ and 1.503, shown in Figures 12.14 and 12.15,
the DDP has passed through several intervals of chaotic and nonchaotic motion, but I omit the details
for now.

The logarithmic plot of Figure 12.15(b) shows the divergence of two pendulums
with the same drive strength $\gamma = 1.503$, but an initial separation of 0.001 radians. The
separation of the two pendulums increases exponentially for the first 9 or 10 cycles
and levels out by about $t = 15$. A dramatic feature of this divergence is that it is big
enough to be seen in the conventional linear plot of Figure 12.16, which shows the
actual positions $\phi_1(t)$ and $\phi_2(t)$ of the two pendulums. At first sight it is perhaps
surprising that for the first 8.5 cycles the two curves are completely indistinguishable,
but that the difference is then so abundantly visible. However, you can understand
this striking behavior by reference to Figure 12.15(b), where you can see that until
$t \approx 8.5$ the separation $\Delta\phi(t)$, although growing rapidly, is nevertheless always less
than about 1/3 radian, too small to be seen on the scale of Figure 12.16. By the time
$t \approx 9.5$, $\Delta\phi(t)$ has reached about 3 (which is easily visible on the linear plot) and
is still climbing rapidly. Thus from $t \approx 9.5$ the two curves are completely distinct.

The main morals to be drawn from these last two examples are these: (1) Once the
drive strength $\gamma$ of our DDP is past the critical value $\gamma_c = 1.0829$, there are intervals
where the motion is chaotic and others where it is not. These intervals are often quite
narrow, so that the chaotic motion comes and goes with startling rapidity. (2) The chaos
can take on several different forms, such as the erratic “rolling” motion of Figure
12.15(a). (3) The erratic motion of chaos always goes along with the sensitivity to
initial conditions associated with the exponential divergence of neighboring solutions
of the equation of motion.

[Figure panel title: $\gamma = 1.503$]

Figure 12.16 Linear plot of the positions of the same two identical
DDP’s whose separation $\Delta\phi(t)$ was shown in Figure 12.15(b)
[$\Delta\phi(0) = 0.001$ rad]. For the first eight and a half drive cycles, the
two curves are indistinguishable; after this the difference is dramatically
apparent.

¹⁷ If you look closely you can see that for the first 7 cycles, the motion is very close to being
a steady rotation of $-2\pi$ per cycle, with a regular period-1 oscillation superposed. This type of
motion is actually periodic, since a change of $-2\pi$ brings the pendulum back to the same place. For
some values of the drive strength, the long-term motion settles down exactly this way, a phenomenon
called phase locking. (See Problem 12.17.)


12.6 Bifurcation Diagrams 


So far, each of our pictures of the motion of the driven damped pendulum has shown
the motion for one particular value of the drive strength $\gamma$. To observe the evolution
of the motion as $\gamma$ changes, we have had to draw several different plots, one for each
value of $\gamma$. One would like to construct a single plot that somehow displayed the whole
story, with its changing periods and its alternating periodicity and chaos as $\gamma$ varies.
This is the purpose of the bifurcation diagram.

A bifurcation diagram is a cunningly constructed plot of $\phi(t)$ against $\gamma$ as in
Figure 12.17. Perhaps the best way to explain what this plot shows is just to describe
in detail how it was made. Having decided on a range of values of $\gamma$ to display (from
$\gamma = 1.06$ to 1.087 in Figure 12.17) one must first choose a large number of values of
$\gamma$, evenly spaced across the chosen range. For Figure 12.17, I chose 271 values of $\gamma$,
spaced at intervals of 0.0001,

$$\gamma = 1.0600,\ 1.0601,\ 1.0602,\ \cdots,\ 1.0869,\ 1.0870.$$

For each chosen value of $\gamma$, the next step is to solve numerically the equation of motion
(12.11) from $t = 0$ to a time $t_{\max}$ picked so that all transients have long since died out.
To make Figure 12.17, I chose the same initial conditions as in the last few pictures,
namely $\phi(0) = -\pi/2$ and $\dot\phi(0) = 0$.¹⁸

¹⁸ Some authors like to superpose the plots for several different initial conditions. This gives a
more complete picture of the many possible motions, but makes the plot harder to interpret. For
simplicity, I chose to use just one set of initial conditions.

Figure 12.17 Bifurcation diagram for the driven damped pendulum for drive
strengths $1.060 < \gamma < 1.087$. The period-doubling cascade is clearly visible: At
$\gamma_1 = 1.0663$ the period changes from 1 to 2, and at $\gamma_2 = 1.0793$ from 2 to 4. The
next bifurcation, from period 4 to 8, is easily seen at $\gamma_3 = 1.0821$, and that from 8 to
16 is just discernible at $\gamma_4 = 1.0827$. To the right of the critical value, $\gamma_c = 1.0829$,
the motion is mostly chaotic, although at $\gamma = 1.0845$ you can just make out a brief
interval of period-6 motion.


To understand our next move, recall that a good way to check for periodicity (or
non-periodicity) is to examine the values

$$\phi(t_0),\ \phi(t_0 + 1),\ \phi(t_0 + 2),\ \cdots$$

of $\phi(t)$ for a large number of times at one-cycle intervals. If the motion is periodic
with period $n$, these will repeat themselves after $n$ cycles, otherwise not. Therefore,
our next step is to use our solutions for $\phi(t)$ to find the values of $\phi(t)$ over a range of
times at integer intervals from some chosen $t_{\min}$ to $t_{\max}$ (with $t_{\min}$ large enough that
all transients have died out). For Figure 12.17, I found $\phi(t)$ for 100 times,

$$t = 501,\ 502,\ \cdots,\ 600.$$

(Since this had to be done for 271 different values of $\gamma$, there were in all $271 \times 100$ or
nearly 30,000 calculations to do, and the whole process took several hours.) Finally,
for each value of $\gamma$, these hundred values of $\phi(t)$ were drawn as dots on the plot of $\phi$
against $\gamma$. To see what this accomplishes, consider first a value of $\gamma$ such as $\gamma = 1.065$
where we know that the motion has period 1. With the period equal to 1, the hundred
successive values of $\phi(t)$ are all the same, and the 100 dots all land at the same place in
the plot of $\phi$ against $\gamma$. Thus what we see at any $\gamma$ for which the period is 1 is a single
dot. From $\gamma = 1.06$ till the threshold value $\gamma_1 = 1.0663$ where the period doubles, our
plot is therefore a single curve.

At the threshold $\gamma_1 = 1.0663$, the period changes to 2, and the positions

$$\phi(501),\ \phi(502),\ \phi(503),\ \cdots,\ \phi(600)$$

now alternate between two different values. Therefore, these 100 points actually create
exactly two distinct dots on the plot, and the single curve bifurcates at $\gamma_1$ into two
curves. At $\gamma_2 = 1.0793$ the period doubles again, to period 4, and each of the two
curves bifurcates, giving four curves in all. The next doubling, to period 8, is easily
seen (though I have not actually indicated it on the picture), and if you look closely you
can just pick out some of the bifurcations to period 16. After this, the graph becomes
a nearly solid confusion of points, and it is impossible to tell (from the graph, at least)
the exact value $\gamma_c$ where chaos begins, though it is clearly somewhere just below
$\gamma = 1.083$. Beyond this point, for the remainder of Figure 12.17, the motion is mostly
chaotic, though you can see a small window at $\gamma = 1.0845$, indicated by a vertical
dashed line. (The window is especially noticeable in the upper section of the plot,
where the dots are otherwise denser.) If you hold a ruler to this vertical line, you
will see that at this particular value of $\gamma$ there are just six distinct points. That is, at
$\gamma = 1.0845$, the motion has returned briefly to being periodic, this time with period 6.
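
The recipe just described (choose values of $\gamma$, integrate past the transients, record $\phi$ once per drive cycle) can be sketched in a few lines. This is an illustrative sketch rather than the program used for Figure 12.17: it assumes the equation of motion (12.11) has the form $\ddot\phi + 2\beta\dot\phi + \omega_0^2\sin\phi = \gamma\omega_0^2\cos\omega t$ with $\omega = 2\pi$, $\omega_0 = 1.5\,\omega$, $\beta = \omega_0/4$, and it uses a fixed-step Runge-Kutta integrator with a step count chosen arbitrarily. The function names are hypothetical.

```python
import math

OMEGA = 2 * math.pi
OMEGA_0 = 1.5 * OMEGA
BETA = 0.25 * OMEGA_0


def deriv(t, phi, v, gamma):
    """Right-hand side of the assumed DDP equation, as (dphi/dt, dv/dt)."""
    return v, (-2 * BETA * v - OMEGA_0 ** 2 * math.sin(phi)
               + gamma * OMEGA_0 ** 2 * math.cos(OMEGA * t))


def rk4_step(t, phi, v, dt, gamma):
    """One classical fourth-order Runge-Kutta step."""
    k1p, k1v = deriv(t, phi, v, gamma)
    k2p, k2v = deriv(t + dt / 2, phi + dt / 2 * k1p, v + dt / 2 * k1v, gamma)
    k3p, k3v = deriv(t + dt / 2, phi + dt / 2 * k2p, v + dt / 2 * k2v, gamma)
    k4p, k4v = deriv(t + dt, phi + dt * k3p, v + dt * k3v, gamma)
    return (phi + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6,
            v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6)


def strobe_values(gamma, t_min, t_max, steps_per_cycle=200):
    """Values of phi(t) at the integer times t_min, t_min + 1, ..., t_max."""
    dt = 1.0 / steps_per_cycle
    phi, v = -math.pi / 2, 0.0        # initial conditions quoted in the text
    samples = []
    for cycle in range(t_max):
        for k in range(steps_per_cycle):
            phi, v = rk4_step(cycle + k * dt, phi, v, dt, gamma)
        if cycle + 1 >= t_min:        # record phi at the end of each late cycle
            samples.append(phi)
    return samples


def bifurcation_points(gammas, t_min, t_max, steps_per_cycle=200):
    """All (gamma, phi) dots of the diagram, one batch of dots per gamma."""
    return [(g, phi) for g in gammas
            for phi in strobe_values(g, t_min, t_max, steps_per_cycle)]


if __name__ == "__main__":
    # A deliberately small run; the text uses 271 gammas and t = 501, ..., 600.
    gammas = [1.06 + 0.001 * i for i in range(28)]
    dots = bifurcation_points(gammas, t_min=101, t_max=120)
    print(len(dots), "dots computed")
```

With the settings quoted in the text (`t_min=501`, `t_max=600`, 271 values of $\gamma$) the same functions apply unchanged, though a pure-Python run of that size is slow.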

A Larger View 

Figure 12.17 shows a rather small range of drive strengths ($1.06 < \gamma < 1.087$) in great
detail. Before we examine a larger range of drive strengths, we must cope with
one small complication. As $\gamma$ increases, we have seen that the pendulum can start
a “rolling” motion in which it makes many complete revolutions. In some cases, it
can continue to “roll” indefinitely, so that $\phi(t)$ eventually approaches $\pm\infty$. Even if
this rolling motion is perfectly periodic, the successive values

$$\phi(t_0),\ \phi(t_0 + 1),\ \phi(t_0 + 2),\ \cdots$$

never repeat themselves, since they increase by a multiple of $2\pi$ in each cycle. This
renders a bifurcation plot, drawn exactly as in Figure 12.17, useless. The most obvious
way to get around this difficulty is to redefine $\phi$ so that it always lies in the range

$$-\pi < \phi < \pi.$$

Each time $\phi$ increases past $\pi$, we subtract $2\pi$, and each time it decreases past $-\pi$,
we add $2\pi$. With this modification, we can now draw a bifurcation diagram as before.
However, keeping $\phi$ between $\pm\pi$ in this way has the disadvantage that it introduces
a meaningless discontinuous jump into $\phi(t)$, each time it passes $\pm\pi$.
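
The redefinition of $\phi$ described here amounts to a one-line function. The version below is one possible sketch (the name `wrap_angle` is illustrative); it uses the two-argument arctangent, which returns the equivalent angle in the interval from $-\pi$ to $\pi$ without any explicit bookkeeping of how many times $\pm\pi$ has been passed.

```python
import math


def wrap_angle(phi):
    """Map any angle to the equivalent angle between -pi and pi."""
    return math.atan2(math.sin(phi), math.cos(phi))


if __name__ == "__main__":
    for phi in (0.5, 1.5 * math.pi, -26 * math.pi + 0.3):
        print(f"{phi:10.4f} -> {wrap_angle(phi):8.4f}")
```

Applying `wrap_angle` to the sampled values of $\phi$ makes a periodic rolling motion show up as repeating dots, at the price of the discontinuous jumps mentioned above.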

A second, and sometimes simpler, way around the problem of the $2\pi$-ambiguity
in $\phi$ is to plot the values of the angular velocity

$$\dot\phi(t_0),\ \dot\phi(t_0 + 1),\ \dot\phi(t_0 + 2),\ \cdots \tag{12.27}$$

instead of the angular position $\phi(t_0), \cdots$. The angular velocity $\dot\phi$ is immune to the
$2\pi$-ambiguity of $\phi$ (since $\dot\phi$ is unaffected by the addition of any multiple of $2\pi$ to
$\phi$). Thus, if the motion is periodic with period $n$, then the values (12.27) will repeat
themselves after $n$ cycles, and otherwise not. Therefore, a bifurcation diagram drawn
using the values of $\dot\phi$ instead of $\phi$ will work just like Figure 12.17, even if the pendulum
undergoes a rolling motion.

Figure 12.18 Bifurcation diagram showing values of $\dot\phi$ for the DDP with drive
strengths $1.03 < \gamma < 1.53$. The intervals labelled a, b, $\cdots$, f across the top are as
follows: (a) This interval is the same as was shown in much greater detail in
Figure 12.17. It starts with period 1, followed by a period-doubling cascade leading to
chaos. (b) Mostly chaos. (c) Period 3. (d) Mostly chaos. (e) Period 1, followed by
another period-doubling cascade. (f) Mostly chaos.


Figure 12.18 is a bifurcation diagram drawn using values of $\dot\phi$ over a range from
just above $\gamma = 1.0$ to just above $\gamma = 1.5$. The first part of this picture, labelled (a) at
the top, is the interval that was shown in much greater detail in Figure 12.17, with
a period-doubling cascade that starts from period 1 and ends in chaos. Section (b)
is mostly chaos, although we already know that it contains some narrow windows of
periodicity (most of which are completely hidden at the scale used here). Section (c) is
very clearly period 3 and includes the value $\gamma = 1.13$ that was shown in Figure 12.14.
Section (d) is mostly chaos, while (e) starts with a long stretch of period 1, followed
by another period-doubling cascade. Finally, section (f) is mostly chaos, although you
can just pick out some windows of periodicity.



A striking feature of Figure 12.18 is the long interval of period-1 motion, from
just below $\gamma = 1.3$ to just above $\gamma = 1.4$. This period-1 motion is actually a rolling
motion, as you can see in Figure 12.19 which shows the motion for $\gamma = 1.4$. In part
(a), which shows $\phi(t)$ as a function of $t$, you can see that the pendulum is rolling
clockwise at a rate of one complete revolution per drive cycle ($\phi$ decreases by $20\pi$ in
10 cycles). That the motion really is periodic is even more evident in part (b), which
shows $\dot\phi(t)$ as a function of $t$. After about two drive cycles, $\dot\phi(t)$ is clearly periodic
with period 1.

Figure 12.19 Motion of the DDP with drive strength $\gamma = 1.4$. (a) The plot of $\phi(t)$ against
$t$ shows a periodic rolling motion in which $\phi$ decreases by $2\pi$ in each drive cycle. (b) The
plot of angular velocity $\dot\phi(t)$ against $t$ shows even more clearly that after about two drive
cycles the motion becomes periodic, with $\dot\phi(t)$ returning to the same value once each cycle.


12.7 State-Space Orbits 


In the next two sections I give a brief introduction to the Poincaré section, which is
an important alternative way to view the motion of chaotic (and nonchaotic) systems.
The Poincaré section is a simplification of the so-called state-space orbit. This
simplification is especially helpful for complicated multidimensional systems but can be
introduced in the context of our one-dimensional driven damped pendulum. Thus I
shall start in this section by describing state-space orbits for the DDP.

In our discussion of the DDP we have focused almost exclusively on the position
$\phi(t)$ as a function of $t$. It turns out, however, that it is sometimes an advantage to follow
both the position $\phi(t)$ and the angular velocity $\dot\phi(t)$ as time evolves. In principle, if
one knows $\phi(t)$ for all $t$, then one can calculate $\dot\phi(t)$ by straightforward differentiation.
Thus to follow $\dot\phi(t)$ as well as $\phi(t)$ is, in this sense, redundant. Nevertheless, following
both variables can provide new insights into the motion, and this is what we shall now
discuss.

There is an immediate problem in plotting the two variables $\phi(t)$ and $\dot\phi(t)$ as
functions of the third variable $t$, since this requires a three-dimensional plot, something
which is hard to make, and not especially illuminating when made. The usual
procedure is to draw the pair of values $[\phi(t), \dot\phi(t)]$ as a point in a two-dimensional plane
where the horizontal axis labels $\phi$ and the vertical axis $\dot\phi$. (For reasons I’ll discuss in
a moment, this plane with coordinates $\phi$ and $\dot\phi$ is called state space.) As time passes,
the point $[\phi(t), \dot\phi(t)]$ moves in this two-dimensional space and traces out a curve,
which is called a state-space orbit (or phase-space trajectory). Once you get used to
interpreting these state-space orbits, you will find that they give a rather clear picture
of the system’s motion.
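
Computing a state-space orbit requires nothing beyond recording both $\phi$ and $\dot\phi$ at every step of a numerical solution. The sketch below illustrates this under the same assumptions as before, none of which are spelled out in this section: the equation of motion is taken to be $\ddot\phi + 2\beta\dot\phi + \omega_0^2\sin\phi = \gamma\omega_0^2\cos\omega t$ with $\omega = 2\pi$, $\omega_0 = 1.5\,\omega$, $\beta = \omega_0/4$, and the fixed-step Runge-Kutta integrator and the name `state_space_orbit` are choices made here for illustration.

```python
import math

OMEGA = 2 * math.pi
OMEGA_0 = 1.5 * OMEGA
BETA = 0.25 * OMEGA_0


def deriv(t, phi, v, gamma):
    """Right-hand side of the assumed DDP equation, as (dphi/dt, dv/dt)."""
    return v, (-2 * BETA * v - OMEGA_0 ** 2 * math.sin(phi)
               + gamma * OMEGA_0 ** 2 * math.cos(OMEGA * t))


def rk4_step(t, phi, v, dt, gamma):
    """One classical fourth-order Runge-Kutta step."""
    k1p, k1v = deriv(t, phi, v, gamma)
    k2p, k2v = deriv(t + dt / 2, phi + dt / 2 * k1p, v + dt / 2 * k1v, gamma)
    k3p, k3v = deriv(t + dt / 2, phi + dt / 2 * k2p, v + dt / 2 * k2v, gamma)
    k4p, k4v = deriv(t + dt, phi + dt * k3p, v + dt * k3v, gamma)
    return (phi + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6,
            v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6)


def state_space_orbit(gamma, phi0, phidot0, cycles, steps_per_cycle=200):
    """Return the state-space points (phi, phidot), one per time step."""
    dt = 1.0 / steps_per_cycle
    phi, v = phi0, phidot0
    orbit = [(phi, v)]
    for n in range(cycles * steps_per_cycle):
        phi, v = rk4_step(n * dt, phi, v, dt, gamma)
        orbit.append((phi, v))
    return orbit


if __name__ == "__main__":
    # gamma = 0.6 is the drive strength used for the first example in the text.
    orbit = state_space_orbit(0.6, -math.pi / 2, 0.0, cycles=20)
    for t in (0, 1, 2, 20):
        phi, v = orbit[t * 200]
        print(f"t = {t:2d}: phi = {phi:+.4f}, phidot = {v:+.4f}")
```

Plotting the second component of each returned pair against the first gives the orbit; discarding the first few cycles of the list leaves only the attractor.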

As a first example, let us consider a DDP with $\gamma = 0.6$ (a drive strength for which
the linear approximation is still fairly good) and with our favorite initial conditions
$\phi(0) = -\pi/2$ and $\dot\phi(0) = 0$. Figure 12.20 shows a conventional plot of $\phi(t)$ against
$t$ for this case. To interpret this picture one has to know (as you certainly do) that the
changing position $\phi(t)$ is shown by the vertical displacement of the graph while the
time $t$ advances from left to right. With this understood, you can clearly see the motion
starting at $t = 0$ with $\phi(0) = -\pi/2$ and quickly approaching the expected sinusoidal
attractor, with $\phi(t)$ of the form $\phi(t) = A\cos(\omega t - \delta)$.

Figure 12.20 Conventional plot of $\phi(t)$ against $t$ for a DDP with drive
strength $\gamma = 0.6$. The motion quickly settles down to almost perfectly
sinusoidal oscillation.

Figure 12.21 State-space orbit for a DDP with drive strength $\gamma = 0.6$. State
space is the two-dimensional plane with coordinates $\phi$ and $\dot\phi$; the state-space
orbit is just the path traced by the point $[\phi(t), \dot\phi(t)]$ as time passes. (a) The first
20 cycles, starting from the initial values $\phi(0) = -\pi/2$ and $\dot\phi(0) = 0$. The three
dots labelled 0, 1, and 2 show the positions of $[\phi(t), \dot\phi(t)]$ at $t = 0$, 1, and 2.
The orbit spirals inward and rapidly approaches the period-one attractor, which
appears as an ellipse in state space. (b) The same as (a) but with the first 5 cycles
omitted so that only the elliptical attractor is seen. Between the times $5 < t < 20$,
the point $[\phi(t), \dot\phi(t)]$ moves 15 times around the same elliptical path.

Figure 12.22 State-space orbit for a DDP with drive strength $\gamma = 0.6$ and initial
conditions $\phi(0) = \dot\phi(0) = 0$. (a) The first 20 cycles, starting from the origin and
spiralling out toward the elliptical attractor. (b) In the 15 cycles $5 < t < 20$, the
orbit moves 15 times around the elliptical attractor to give exactly the same picture
as in Figure 12.21(b).

Figure 12.21 shows the state-space orbit for the same DDP with the same initial
conditions. Part (a) shows the first twenty cycles, $0 < t < 20$. To interpret this picture,
one has to know that as $t$ advances the curve is traced, in the direction of the arrows,
by the pair $[\phi(t), \dot\phi(t)]$. With this understood, you can see clearly that the orbit starts
out from $\phi(0) = -\pi/2$ and $\dot\phi(0) = 0$. Since the initial acceleration $\ddot\phi$ is positive,¹⁹
$\dot\phi(t)$ increases from the outset, and $\phi(t)$ begins to increase as soon as $\dot\phi$ is nonzero.
Thus the point $[\phi(t), \dot\phi(t)]$ moves up initially, curving to the right. The oscillation of
$\phi(t)$ is evidenced by the back and forth, left-right, motion of the orbit; the oscillation
of $\dot\phi(t)$ by the up and down vertical motion. Eventually, as the transients die out, the
motion approaches its long-term attractor, in which (in the linear approximation) we
know that $\phi(t)$ has the form

$$\phi(t) = A\cos(\omega t - \delta). \tag{12.28}$$

This implies that the angular velocity $\dot\phi(t)$ approaches the form

$$\dot\phi(t) = -\omega A \sin(\omega t - \delta). \tag{12.29}$$

The two equations (12.28) and (12.29) are the parametric equations for an ellipse
drawn clockwise in the plane of $(\phi, \dot\phi)$, with semi-axes $A$ along the $\phi$ direction and
$\omega A$ along the $\dot\phi$ direction. Thus, once the transients have died out, the point $[\phi(t), \dot\phi(t)]$ moves around this
ellipse with angular frequency equal to the drive frequency $\omega$; that is, the state-space
orbit completes one revolution per drive cycle. In Figure 12.21(a), the state-space
orbit spirals in toward this ellipse, merging with it after about three cycles. [This
already illustrates one small advantage of the state-space orbit over the conventional
plot of $\phi(t)$ against $t$: In the conventional Figure 12.20 the actual motion has become
indistinguishable from the limiting sinusoidal motion after little more than 1 cycle; in
the state-space plot of Figure 12.21(a) the actual and limiting orbits can be told apart
for some three cycles. Thus the state-space orbit gives a more sensitive picture of the
approach to the attractor.] Figure 12.21(b) is the same as part (a), except that I have
omitted the first 5 cycles; that is, in part (b), $5 < t < 20$ and only the elliptical attractor
shows up. Since our main interest is usually in the limiting motion, state-space plots
are usually drawn as in part (b), with enough initial cycles omitted so that only the
limiting motion is visible.

¹⁹ As you can easily check, with the given initial conditions, both gravity and the drive force
give the pendulum a positive acceleration at first.
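
That (12.28) and (12.29) describe an ellipse can be checked by eliminating $t$. The short calculation below uses only the symbols already defined in those two equations:

```latex
\left(\frac{\phi}{A}\right)^{2} + \left(\frac{\dot\phi}{\omega A}\right)^{2}
  = \cos^{2}(\omega t - \delta) + \sin^{2}(\omega t - \delta) = 1 .
```

This is the equation of an ellipse centered on the origin with semi-axis $A$ in the $\phi$ direction and $\omega A$ in the $\dot\phi$ direction. The sense of rotation follows from looking at one instant: when $\omega t - \delta = 0$ the point is at $(A, 0)$, the rightmost point of the ellipse, and just afterward $\dot\phi$ is negative, so the point is moving downward there, which is the clockwise sense.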

Figure 12.22 shows the state-space orbit for exactly the same DDP as in Figure
12.21, but with initial conditions $\phi(0) = \dot\phi(0) = 0$. In part (a) you can easily see that
the orbit starts out with the stated initial conditions, and spirals outward, completing
some 2.5 cycles before merging with the elliptical attractor. Part (b) shows the 15
cycles starting from $t = 5$, by which time the orbit is indistinguishable from its
long-term attractor. In particular, Figure 12.22(b) is exactly the same as Figure 12.21(b),
because for $\gamma = 0.6$ all initial conditions lead to the same attractor.


State Space

I shall give a detailed discussion of state space in Chapter 13, but here is a brief
explanation of the terminology. For our pendulum, state space (also called phase space) is
the two-dimensional plane defined by the two variables $\phi$, the angular position, and
$\dot\phi$, the angular velocity. This is to be contrasted with the one-dimensional
configuration space defined by the one variable $\phi$ that gives the position, or configuration of
the system. More generally, the configuration space of an $n$-dimensional mechanical
system is the $n$-dimensional space of its $n$ position coordinates $q_1, \cdots, q_n$, whereas
state space is the $2n$-dimensional space of the coordinates $q_1, \cdots, q_n$ and velocities
$\dot q_1, \cdots, \dot q_n$. I shall discuss several properties and uses of state space in Chapter 13.
Here I mention just one important feature: The “state” (or “state of motion” in full)
of a mechanical system is often used to mean a specification of the motion (at any
chosen time $t_0$) that is complete enough to determine uniquely the motion at all later
times. That is, the state of a system defines the initial conditions needed to specify
a unique solution of the equation of motion. For our pendulum, specification of the
position $\phi$ at time $t_0$ is not sufficient to determine a unique solution, but specification
of $\phi$ and $\dot\phi$ is. That is, the two variables $\phi$ and $\dot\phi$ define the state of the pendulum, and
the space of all pairs $(\phi, \dot\phi)$ is naturally called state space.

A state-space orbit is simply the path traced in state space by the pair $[\phi(t), \dot{\phi}(t)]$
as time evolves. Natural as this name is, you must recognize that a state-space orbit
is very different from the orbit of, say, a planet in ordinary space with coordinates
$\mathbf{r} = (x, y, z)$. For example, a planet can have many different orbits passing through a
single point $\mathbf{r}$ at a given time $t_0$. On the other hand, from what was just said about initial
conditions, it follows that for any “point” $(\phi, \dot{\phi})$ in state space, our pendulum has
exactly one state-space orbit passing through $(\phi, \dot{\phi})$ at any given $t_0$. Another curious
feature of state-space orbits concerns their direction of flow: Since the vertical axis
represents the velocity $\dot{\phi}$, the motion at any point above the horizontal axis ($\dot{\phi} > 0$)
is always to the right (increasing $\phi$), as seen in Figure 12.22. Similarly, the motion
at any point below the horizontal axis has to be to the left. If an orbit crosses the
horizontal axis then, since $\dot{\phi} = 0$, the orbit must be moving exactly vertically ($\phi$ not
changing). All of these properties are illustrated in Figure 12.22. They imply that any
closed state-space orbit, such as the elliptical attractor of Figure 12.22(b), is always
traced in a clockwise direction.


More State-Space Orbits 

As we increase the drive strength $\gamma$, we know that the motion of our DDP undergoes
various dramatic changes, some of which show up very nicely in plots of the state-space
orbits. For instance, Figure 12.23 shows the state-space orbits for $\gamma = 1.078$ and
$\gamma = 1.081$, both in the middle of the period-doubling cascade first shown in Figure
12.8. Both plots show forty cycles starting from $t = 20$, by which time all initial
transients have completely died out. That is, both plots show the limiting, long-term
motion and are to be compared with Figure 12.22(b). As in that picture, these new
orbits move around the origin in more-or-less elliptical loops, but in both cases the
orbit makes more than one loop before closing on itself. In part (a) there are two
distinct loops, each of which lasts for one drive cycle, so that the motion repeats itself
once every two cycles; that is, it has period two. In part (b) there are four distinct
loops, indicating very clearly that the period has doubled again to period four. It is
important to understand that it makes no difference how many cycles we plot in these
two figures [as long as we start after the transients have died out, and plot at least two
cycles in part (a) and four in part (b)]. I could have plotted from $t = 20$ to 100 or from
20 to 1000, and part (a) would still have shown the same two loops and part (b) the
same four loops.

Figure 12.23 State-space orbits showing the periodic attractors for (a) $\gamma = 1.078$
with period 2 and (b) $\gamma = 1.081$ with period 4. Both plots show the forty
cycles from $t = 20$ to 60. In part (a), the orbit traces just two distinct loops
twenty times each; in part (b) it traces four loops ten times each. Compare
Figure 12.8 (middle two lines).


Chaos 

If we increase the drive strength $\gamma$ a little further, we enter a region of chaos. Figure
12.24 shows the state-space orbit for drive strength $\gamma = 1.105$, whose chaotic character
was shown in Figures 12.10 and 12.13. Part (a) shows seven cycles from $t = 14$ to
$t = 21$, and you can see clearly that in seven cycles the orbit fails to repeat or to
close on itself. Thus if the motion is periodic, its period must be greater than 7. To
decide whether it is periodic, we need to plot more cycles. In part (b), I have plotted
from $t = 14$ to 200, and the plot has become an almost solid swath of black but
has still not repeated itself. [The evidence for this last claim is that in a plot out to
$t = 400$ (not shown) the curve moves into several of the remaining gaps of part (b);
thus it has certainly not begun to repeat by $t = 200$.] Therefore, Figure 12.24 adds
strong support to our conclusion that the motion never repeats itself and is in fact
chaotic.

The black swath of Figure 12.24(b) is very striking, but is too full of information
to be of much use. We need a way to extract from this picture a smaller amount of
information that might actually tell us more. The technique for doing this is the
so-called Poincaré section, but before we take this up, I want to give two more examples
of state-space orbits.

Figure 12.24 State-space orbits for a DDP with $\gamma = 1.105$ showing the
chaotic attractor. (a) In the seven cycles from $t = 14$ to 21 the orbit does
not close on itself. (b) The same is true in the 186 cycles from $t = 14$ to
200, and by now it is pretty clear that the motion is never going to repeat
itself and is in fact chaotic.


State-Space Orbits for Rolling Motion 

We have already seen that for $\gamma = 1.4$ our DDP executes a “rolling motion,” making
a complete clockwise rotation once each drive cycle (Figure 12.19). The state-space
orbit for this motion is shown in Figure 12.25. In this plot you can see clearly how,
after a couple of cycles, the pendulum settles down to a periodic motion in which
$\phi$ decreases by $2\pi$ and the pendulum makes a complete clockwise rotation once per
cycle.

The plot of Figure 12.25 is a very satisfactory way of showing the state-space
orbit over a small number of cycles. Sometimes, however (if the motion is chaotic,
for instance), one would like to show the orbit over a very long time interval, several
hundred cycles perhaps, and in this case $\phi$ may range over many hundreds of
complete revolutions. To show this in the format of Figure 12.25, we would be forced
to compress the scale on the $\phi$ axis to the point where the motion would be completely
indecipherable. The usual way around this difficulty is to redefine $\phi$ so that it always
lies between $-\pi$ and $\pi$: Each time $\phi$ decreases past $-\pi$ we add on $2\pi$, and each
time it increases past $\pi$ we subtract $2\pi$. (This is acceptable since any two values
of $\phi$ that differ by a multiple of $2\pi$ represent the same position of the pendulum.)
With $\phi$ redefined in this way, the state-space orbit of Figure 12.25 looks as shown
in Figure 12.26(a). This new plot is not an obvious improvement on Figure 12.25
(though we shall see that it does have some advantages), but you should study it
carefully to understand the relationship of the two kinds of picture. You can think of
the new picture as being obtained from Figure 12.25 by cutting apart the intervals
$-3\pi < \phi < -\pi$, and $-5\pi < \phi < -3\pi$, and so on, and pasting them all back on top
of the interval $-\pi < \phi < \pi$. In the resulting picture, $\phi$ makes a discontinuous jump
each time it arrives at $\phi = \pm\pi$. For example, at about $t = 0.7$, $\phi$ decreases to $-\pi$ at
the point A and jumps to the point B.

Figure 12.25 First six cycles of the state-space orbit for a DDP with
$\gamma = 1.4$, showing the periodic rolling motion, in which $\phi$ decreases by $2\pi$
in each cycle. The numbers 0, 1, ..., 6 indicate the state-space “position”
$(\phi, \dot{\phi})$ at the times $t = 0, 1, \ldots, 6$.

Figure 12.26 (a) The exact same orbit as in Figure 12.25 but with $\phi$ redefined
so that it remains between $-\pi$ and $\pi$. Each time $\phi$ decreases to $-\pi$ (at A for
example), the orbit disappears and reappears at $+\pi$ (at B for example). (b) By
the time $t = 10$ the orbit has settled down to perfectly periodic motion, with
each successive cycle lying exactly (on this scale) on top of its predecessor.
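
The redefinition of $\phi$ used for Figure 12.26 amounts to reducing the angle modulo $2\pi$. A minimal Python sketch of that wrapping is given here; the function names `wrap_angle` and `wrap_orbit` are hypothetical, and the half-open interval $[-\pi, \pi)$ is an implementation choice (the text does not say which end point is kept).

```python
import math

def wrap_angle(phi):
    """Return the angle equivalent to phi that lies in [-pi, pi).

    Adding or subtracting multiples of 2*pi does not change the position
    of the pendulum, so the wrapped value describes the same configuration.
    """
    return (phi + math.pi) % (2 * math.pi) - math.pi

def wrap_orbit(points):
    """Wrap the angle of every (phi, phidot) pair; velocities are unchanged."""
    return [(wrap_angle(phi), phidot) for phi, phidot in points]

if __name__ == "__main__":
    # A rolling orbit whose angle keeps decreasing (illustrative numbers only).
    rolling = [(-0.5 - 2.0 * k, -6.0) for k in range(5)]
    for original, wrapped in zip(rolling, wrap_orbit(rolling)):
        print(f"phi = {original[0]:+7.3f}  ->  wrapped phi = {wrapped[0]:+7.3f}")
```

A point that has just decreased past $-\pi$, say $\phi = -\pi - 0.1$, reappears at $\phi = +\pi - 0.1$, which is the jump from A to B described above.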

An advantage of Figure 12.26(a) over Figure 12.25 is that the new picture gives 
a more incisive test of the periodicity of the orbit. In the new picture, you can see 
that the orbit is approaching a periodic attractor, but it is also clear that in the interval 
0 < t < 6 the orbit has definitely not reached the periodic attractor. (In fact you can 
just about see that there are 6 distinct loops.) On the other hand, by the time t = 10 
the successive cycles are indistinguishable on the scale of these pictures. The twenty 
cycles shown in Figure 12.26(b) all disappear on the left at the same point C, reappear 
at D, and follow the exact same path back to C twenty times over. 

A disadvantage of either plot in Figure 12.26 is the spurious discontinuity each
time $\phi$ jumps from $-\pi$ to $\pi$, as at points A and B, for example. We can get rid of
these discontinuities (at least in our minds) if we imagine the plot cut out and rolled
into a cylinder with the vertical lines $\phi = \pm\pi$ glued together. In this way point A
becomes the same as point B, and the state-space orbit moves continuously around
the vertical cylinder.


More Chaos 

As a final example of a state-space orbit, I have shown the orbit for a DDP with
$\gamma = 1.5$ in Figure 12.27. We already know that the motion is chaotic for this value of
$\gamma$, though for this picture I chose a smaller damping constant, $\beta = \omega_0/8$ instead of the
value $\beta = \omega_0/4$ that I used for all previous pictures in this chapter. As it turns out, this
smaller damping makes the chaotic motion more wild and produces a Poincaré section
that is even more interesting and elegant (as I describe in the next section). With these
parameters, the pendulum undergoes an erratic rolling motion, making many complete
revolutions, first in one direction and then in the other. Thus we are forced to make a
plot with $\phi$ confined between $-\pi$ and $\pi$ as in Figure 12.26, but with dramatically
different results. The motion does not repeat itself in the 190 cycles shown, with
$10 \le t \le 200$. (The evidence for this claim is that in a plot for $10 \le t \le 250$, not
shown, the orbit moves into some of the unvisited regions of Figure 12.27. If it had
begun to repeat itself before $t = 200$, it could not visit new ground after $t = 200$.)

The dense tangle of threads running through Figure 12.27 lends strong support to
the claim that for these parameters the motion is chaotic. Unfortunately, one could not
claim that the picture sheds much light on the nature of chaotic motion. It is just too
densely packed with information to convey any useful message. In the next section
I describe the Poincaré section, which is a technique for culling out of pictures like
Figure 12.27 enough information to allow an interesting pattern to emerge.

Figure 12.27 The chaotic state-space orbit for a DDP
with $\gamma = 1.5$ and $\beta = \omega_0/8$, plotted for $10 \le t \le 200$. In the 190 cycles shown, the
motion does not repeat itself, and, in fact, it never does.

12.8 Poincaré Sections


For the periodic motion of a DDP, a state-space orbit is a descriptive way of viewing
the pendulum’s history. For chaotic motion, a state-space orbit conveys a sense of the
dramatic nature of chaos, but is too full of information to be of much serious use. One
way around this difficulty is a trick which we have used earlier and was suggested by
Poincaré: Instead of following the motion as a function of the continuous variable $t$, we
look at the position just once per cycle at times $t = t_0, t_0 + 1, t_0 + 2, \ldots$. The Poincaré
section for a DDP is just a plot showing the pendulum’s “position” $[\phi(t), \dot{\phi}(t)]$ in state
space at one-cycle intervals

$$t = t_0, \; t_0 + 1, \; t_0 + 2, \; \ldots, \tag{12.30}$$

with $t_0$ usually chosen so that the initial transients have died out.$^{20}$ To illustrate this,
consider the state-space orbit shown in Figure 12.28(a) for a DDP with $\gamma = 1.078$
(and the damping constant restored to our usual value $\beta = \omega_0/4$). The two loops of
this orbit indicate that (as we already knew) the long-term motion has period two.
To emphasize this I have drawn dots to show the position $[\phi(t), \dot{\phi}(t)]$ at one-cycle
intervals, $t = 20, 21, 22, \ldots$. Since the motion has period two, these alternate
between just two distinct positions and show up as just two dots. In a Poincaré section
one dispenses with the orbit and draws just the dots at one-cycle intervals as in Figure
12.28(b). When the motion is periodic, there is no particular advantage to the Poincaré
section over the complete state-space orbit of Figure 12.28(a), although the Poincaré
section does show the period very clearly. (A Poincaré section with four dots would
show period-four motion, and so on.) On the other hand, when the motion is chaotic,
no two cycles of the motion are the same, and the state-space orbit can be a real mess,
as we saw in Figure 12.27. In this case, the Poincaré section reveals some totally
unexpected structures.

$^{20}$ For a multidimensional system, the Poincaré section involves taking a two-dimensional slice,
or section, through the multidimensional state space. Hence the word “section.”

Figure 12.28 (a) State-space orbit of a DDP with $\gamma = 1.078$ for $20 \le t \le 60$.
The dots show the positions at $t = 20, 21, 22, \ldots$, but, since the
motion has period two, these alternate between just two fixed points. The
right point shows the positions for $t = 20, 22, \ldots$; the left one, those for
$t = 21, 23, \ldots$. (b) In the corresponding Poincaré section, one omits the
orbit and draws only the dots showing the positions at $t = 20, 21, 22, \ldots$.
The presence of just two dots is a clear indication of period-two motion.

Figure 12.29 Poincaré section for a pendulum with $\gamma = 1.5$ and damping
constant $\beta = \omega_0/8$ for times $10 \le t \le 60000$. This figure is made of nearly
60000 points showing the “position” $[\phi(t), \dot{\phi}(t)]$ at one-cycle intervals,
$t = 10, 11, \ldots, 60000$. The rectangular box indicates the region that is enlarged
in Figure 12.30.
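
The once-per-cycle sampling that defines a Poincaré section can be sketched numerically as follows. This is a minimal illustration, not the method used to produce the figures. It assumes the DDP equation of motion in the form $\ddot{\phi} + 2\beta\dot{\phi} + \omega_0^2 \sin\phi = \gamma\,\omega_0^2 \cos\omega t$ with $\omega = 2\pi$ (so that one drive cycle lasts one time unit) and $\omega_0 = 1.5\,\omega$; none of these is restated in this section, so treat them as assumptions. The integrator is a hand-written fixed-step Runge-Kutta scheme, and all function names are hypothetical.

```python
import math

OMEGA = 2 * math.pi      # drive frequency; one drive cycle lasts one time unit
OMEGA0 = 1.5 * OMEGA     # natural frequency (assumed value, see lead-in)
BETA = OMEGA0 / 4        # the "usual" damping constant quoted in the text

def ddp_rhs(t, phi, phidot, gamma, beta):
    """Return (d phi/dt, d phidot/dt) for the assumed DDP equation of motion."""
    accel = (-2.0 * beta * phidot
             - OMEGA0 ** 2 * math.sin(phi)
             + gamma * OMEGA0 ** 2 * math.cos(OMEGA * t))
    return phidot, accel

def rk4_step(t, phi, phidot, dt, gamma, beta):
    """Advance (phi, phidot) by one classical fourth-order Runge-Kutta step."""
    k1 = ddp_rhs(t, phi, phidot, gamma, beta)
    k2 = ddp_rhs(t + dt / 2, phi + dt / 2 * k1[0], phidot + dt / 2 * k1[1], gamma, beta)
    k3 = ddp_rhs(t + dt / 2, phi + dt / 2 * k2[0], phidot + dt / 2 * k2[1], gamma, beta)
    k4 = ddp_rhs(t + dt, phi + dt * k3[0], phidot + dt * k3[1], gamma, beta)
    phi += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    phidot += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return phi, phidot

def poincare_section(gamma, beta, phi0, phidot0, t0, t_last, steps_per_cycle=1000):
    """Integrate from t = 0 and return the states at t = t0, t0 + 1, ..., t_last."""
    dt = 1.0 / steps_per_cycle
    phi, phidot = phi0, phidot0
    points = []
    for cycle in range(t_last + 1):
        if cycle >= t0:
            points.append((phi, phidot))      # one dot per drive cycle
        if cycle < t_last:
            for i in range(steps_per_cycle):  # integrate through one full cycle
                phi, phidot = rk4_step(cycle + i * dt, phi, phidot, dt, gamma, beta)
    return points

if __name__ == "__main__":
    # Weak drive (gamma = 0.6): the attractor has period one, so every
    # sampled point should coincide once the transients have died out.
    for phi, phidot in poincare_section(0.6, BETA, 0.0, 0.0, t0=10, t_last=14):
        print(f"{phi:+.6f}  {phidot:+.6f}")
```

For the period-one attractor at $\gamma = 0.6$ the section collapses to a single point. Running the same function with $\gamma = 1.078$, or with $\gamma = 1.5$ and $\beta = \omega_0/8$ over many more cycles, is the kind of calculation behind Figures 12.28(b) and 12.29, though the details of how those figures were made are not given here.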

To illustrate a Poincaré section for chaotic motion, I chose the pendulum whose
chaotic state-space orbit was shown in Figure 12.27. It is clear that, since this motion
never repeats itself, the Poincaré section will contain infinitely many points, and these
infinitely many points will comprise a subset of the points of the full orbit. It is
probably fair to say that no one could ever guess what this subset would look like,
but with the aid of a high-speed computer we can find out. The result is shown in
Figure 12.29. Although it is certainly not obvious exactly what this elegant figure
signifies, it certainly is obvious that it signifies something. By selecting from Figure
12.27 just those points at one-cycle intervals, we have reduced the dense, and nearly
solid, tangle of Figure 12.27 to the elegant curve of Figure 12.29. Actually, while
Figure 12.29 looks like a relatively simple curve, it is not a curve at all, but rather
a fractal. A fractal can be defined in various ways, but a characteristic feature of
fractals is that when one enlarges the scale and zooms in on a portion of the picture,
one uncovers further structures that are in some ways similar to the original picture
(somewhat like a photograph of a person holding a photograph of a person holding a
photograph . . . and so on). To illustrate this property of our Poincaré section, I have
zoomed in on the region indicated by the rectangular box at the bottom of Figure
12.29. Notice that (at the scale of Figure 12.29) this region comprises a prominent
“tongue” pointing to the left near the left of the box, with a second tongue inside the
first near the right of the box. Figure 12.30 is a fourfold enlargement of this box. This
enlargement makes clear that the apparently single tongue on the right of the box of
Figure 12.29 is actually four tongues, while that on the left is actually at least five.

Figure 12.30 Enlargement of the small box at the bottom of the Poincaré
section of Figure 12.29. Each of the two tongues in the box of 12.29 is seen
to be made up of several tongues. The box on the left is the region that is
further enlarged in Figure 12.31.



Figure 12.31 A further enlargement of the box at the left of the enlargement 
of Figure 12.30. Each of the five tongues in the box of 12.30 (except perhaps 
the innermost one) is seen to be made up of several tongues. 


This process of zooming in on successively smaller regions of the Poincaré section
can, at least in principle, be continued indefinitely. Figure 12.31 is a further fourfold
enlargement of the region shown by the gray rectangle on the left of Figure 12.30. In
this enlargement, we see that each of the five tongues on the left of Figure 12.30 (except
perhaps the fifth one) actually consists of several separate tongues. This so-called
self-similarity of the figure is one of the characteristic features of a fractal.

When the Poincaré section of the motion of a chaotic system is a fractal, the
long-term motion is said to be a strange attractor. It would unfortunately be well beyond
the scope of this book to explain what it signifies that the Poincaré section of a chaotic
attractor is fractal, and indeed there is still much about this phenomenon that is not
understood. Nevertheless, it is undeniably fascinating that the strange geometrical
structure of the fractal appears in our study of the long-term behavior of chaotic
systems. This discovery has stimulated much research on both the physics of chaotic
systems and the mathematics of fractals.

Figure 12.32 Poincaré section for a DDP. (a) Experimental results
using the Daedalon Chaotic Pendulum. (b) Theoretical prediction
using the same parameters as in part (a). Courtesy of Professors H.J.T.
Smith and James Blackburn and the Daedalon Corporation.

To observe a strange attractor with a real pendulum would obviously be challenging,
but once again the experimentalists have risen to the challenge. Figure 12.32
shows a Poincaré section made with the Daedalon chaotic pendulum. Part (a) shows
the experimental results and part (b) the theoretical prediction (that is, a numerical
solution of the equation of motion using the experimental values of the parameters).
Considering the great subtlety of these graphs, the agreement is outstanding.$^{21}$


12.9 The Logistic Map 


As I have repeatedly emphasized, the phenomenon of chaos appears in many different
situations. In particular, there are certain systems that can exhibit chaos, but whose
equations of motion, called maps, are simpler than the equations of any mechanical
system. Although these systems are not strictly part of classical mechanics, they
are worth mentioning here, for several reasons: Because their equations of motion are
simple, several aspects of their motion can be understood using quite elementary
methods. Any understanding of chaos that we get from studying these simpler systems can
shed light on the corresponding behavior of mechanical systems. In particular, there
is an intimate connection between these “maps” and the Poincaré sections of mechanical
systems. Finally, a discussion of chaos in this new context highlights the diversity
of systems that exhibit the phenomenon.


$^{21}$ There are, nevertheless, differences. One possible cause is the difficulty of making a drive
motor that is perfectly sinusoidal.

Discrete Time and Maps

In almost all problems of mechanics, one is concerned with the evolution of a system 
as time advances continuously. However, there are systems for which time is a discrete 
variable. The history of any event that occurs just once a year, such as the Super Bowl, 
is an example. The score at Super Bowl games is defined only at a sequence of discrete 
times starting in 1967 and spaced at one-year intervals: 

$$t = 1967, \; 1968, \; 1969, \; \ldots. \tag{12.31}$$

The attendance at your weekly lunch group is defined only for discrete times spaced a 
week apart. The total rainfall in the annual Indian monsoon is defined only for discrete 
times spaced a year apart. 

Even when a variable is defined as a function of continuous time, we may find
that we need its values only at certain discrete times. For example, entomologists
studying the population of a particular bug may have no interest in the population’s
day-by-day evolution; rather they may need to record the bug population just once
a year, immediately after the year’s new arrivals have hatched. Another example of
this situation is the Poincaré section for a mechanical system, as described in Section
12.8. Our ultimate interest is to know the state of the system for all (continuous)
times, but we saw that it is sometimes useful to record just the state at discrete,
one-cycle intervals. To the extent that we are prepared to settle for this smaller amount of
information, the Poincaré section reduces the problem of the pendulum (or whatever)
to a discrete-time problem, and anything we can learn about discrete-time systems
should shed light on the possible behavior of Poincaré sections.

In the case of the driven damped pendulum, we know that the state of the system,
as given by the pair $[\phi(t), \dot{\phi}(t)]$, at any time $t$ determines uniquely the state at any
later time. In particular, it determines the state $[\phi(t + 1), \dot{\phi}(t + 1)]$ one cycle later.
This means that there exists a function $f$ (which we don’t know, but which certainly
exists) that, acting on any chosen pair $[\phi(t), \dot{\phi}(t)]$, gives the corresponding pair
$[\phi(t + 1), \dot{\phi}(t + 1)]$. That is,

$$[\phi(t + 1), \dot{\phi}(t + 1)] = f\bigl([\phi(t), \dot{\phi}(t)]\bigr). \tag{12.32}$$

In the same way, we could imagine a bug species with the property that the population
$n_{t+1}$ in year $t + 1$ is uniquely determined$^{22}$ by the population $n_t$ in the preceding year
$t$. Again this would imply the existence of a function $f$ that carries any $n_t$ onto its
corresponding $n_{t+1}$:

$$n_{t+1} = f(n_t). \tag{12.33}$$

We can call an equation of this form the growth equation for the population concerned.

$^{22}$ This is, of course, a very simplified model. In the real world, the population $n_{t+1}$ certainly
depends on $n_t$, but also on many other factors, such as the rainfall in year $t$, the supply of bug food,
and the population of birds that like to eat the bug. Nevertheless, we can imagine a temperate island,
with a constant supply of bug food and no bug predators, on which $n_{t+1}$ is uniquely determined by
$n_t$ alone.

EXAMPLE 12.1 Exponential Population Growth

The simplest example of a growth equation of the type (12.33) is the case where
$n_{t+1}$ is proportional to $n_t$:

$$n_{t+1} = f(n_t) = r n_t. \tag{12.34}$$

That is, the function $f(n)$ that gives next year’s population in terms of this year’s
is just

$$f(n) = rn \tag{12.35}$$

where the positive constant $r$ could be called the growth rate or growth parameter
of the population. [For example if every bug alive this spring dies before
next spring but leaves two surviving offspring, then the population would satisfy
(12.34) with $r = 2$.] Solve the equation (12.34) for $n_t$ in terms of $n_0$ and discuss
the long-term behavior of $n_t$.

The solution to (12.34) is easily seen by inspection. Observe that

$$n_1 = f(n_0) = r n_0$$

and

$$n_2 = f(n_1) = f(f(n_0)) = r^2 n_0$$

from which it is clear that

$$n_t = f(n_{t-1}) = \underbrace{f(f(\cdots f(n_0) \cdots))}_{t \text{ terms}} = r^t n_0. \tag{12.36}$$

We see that if $r > 1$, the population $n_t$ grows exponentially, approaching infinity
as $t \to \infty$. If $r = 1$, the population stays constant, and if $r < 1$ it decreases
exponentially to zero.
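
The closed form $n_t = r^t n_0$ of (12.36) can be checked by iterating the map directly. The following Python sketch does this; the helper names `iterate_map` and `exponential_map` are hypothetical, and the starting population is an arbitrary illustrative value.

```python
def iterate_map(f, n0, steps):
    """Return [n_0, n_1, ..., n_steps], where n_{t+1} = f(n_t)."""
    values = [n0]
    for _ in range(steps):
        values.append(f(values[-1]))
    return values

def exponential_map(r):
    """Return the map f(n) = r*n of equation (12.35) for a given growth rate r."""
    return lambda n: r * n

if __name__ == "__main__":
    r, n0, steps = 2, 4, 6      # two surviving offspring per bug; n0 is illustrative
    iterated = iterate_map(exponential_map(r), n0, steps)
    closed_form = [r ** t * n0 for t in range(steps + 1)]
    print(iterated)
    print(iterated == closed_form)
```

With integer $r$ and $n_0$ the comparison is exact; with floating-point values the two lists should be compared to within a small tolerance instead.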

Before we discuss a more interesting growth equation, I need to introduce some
terminology. In mathematics, the words “function” and “map” are used as almost
exact synonyms. Thus we can say that Equation (12.33) defines $n_{t+1}$ as a function of
$n_t$. Or we can say that (12.33) is a map in which $f$ carries $n_t$ onto the corresponding$^{23}$
$n_{t+1}$, a relationship that we can represent thus:

$$n_t \xrightarrow{\;f\;} n_{t+1} = f(n_t). \tag{12.37}$$

$^{23}$ The origin of this rather strange usage seems to be in cartography. A cartographer’s map
of the US, for example, establishes a correspondence between each actual point of the US and the
corresponding point on a piece of paper, in somewhat the same way the function $y = f(x)$ establishes
a correspondence between each value $x$ and the corresponding value $y = f(x)$.

The whole sequence of numbers, $n_0, n_1, n_2, \ldots$ can be written similarly as

$$n_0 \xrightarrow{\;f\;} n_1 \xrightarrow{\;f\;} n_2 \xrightarrow{\;f\;} n_3 \cdots \tag{12.38}$$

and is naturally described as an iterated map, or just map for short. For some reason
the word “map” (as opposed to “function”) is used almost universally to describe
relationships like (12.37) between successive values of any discrete-time variable. The
map (12.37) is a one-dimensional map since it carries the single number $n_t$ onto the
single number $n_{t+1}$. The corresponding relationship (12.32) for the Poincaré section of
a DDP defines a two-dimensional map, since it carries the pair of numbers $[\phi(t), \dot{\phi}(t)]$
onto the pair $[\phi(t + 1), \dot{\phi}(t + 1)]$.

The Logistic Map 

The exponential map of Example 12.1 with $r > 1$ is a reasonably realistic model for
the initial growth of many populations, but no real population can grow exponentially
for ever. Something (overcrowding or shortage of food, for example) eventually
slows down the growth. There are many ways to modify the map (12.34) to give a more
realistic model of population growth. One of the simplest is to replace the function
$f(n) = rn$ of (12.34) by

$$f(n) = rn(1 - n/N) \tag{12.39}$$

where $N$ is a large positive constant, whose significance we shall see directly. That is,
we replace the exponential map (12.34) with the so-called logistic map

$$n_{t+1} = f(n_t) = r n_t (1 - n_t/N). \tag{12.40}$$

As long as the population remains small compared to $N$, the term $n_t/N$ in (12.40)
is unimportant, and our new map produces the same exponential evolution as the
exponential map. But if $n_t$ grows toward $N$, the term $n_t/N$ becomes important, and the
parenthesis $(1 - n_t/N)$ begins to diminish and “kill off” some of the excess growth.
Thus this “mortality factor” $(1 - n/N)$ in (12.39) produces exactly the expected
slowing of the population growth as $n$ becomes large, and overcrowding or starvation
become important. In particular, if $n_t$ were to reach the value $N$, the parenthesis
$(1 - n_t/N)$ in (12.40) would vanish, and next year’s population $n_{t+1}$ would be zero. If
$n_t$ were greater than $N$, then $n_{t+1}$ would be negative, which is impossible. In other
words, a population governed by the logistic map (12.40) can never exceed the value
$N$, the maximum or carrying capacity of the model. Notice that, because of the term
involving $n$ in the mortality factor, the logistic map (12.39) (unlike the exponential
map) is nonlinear. It is this nonlinearity that makes possible the chaotic behavior of
the logistic map.

Figure 12.33 compares exponential and logistic growth, for a growth parameter
$r = 2$ and an initial population $n_0 = 4$. The upper curve (gray dots) shows the unending
doubling of the exponential case; the lower curve (black dots) shows the growth
predicted by the logistic map (12.40), with the same growth parameter $r = 2$ and with
a carrying capacity $N = 1000$. As long as $n$ remains small (much less than 1000), the
logistic growth is indistinguishable from pure exponential growth, but once $n$ reaches
100 or so, the mortality factor visibly slows the logistic growth, which eventually
levels off at around $n = 500$.

Figure 12.33 Exponential and logistic growth, both with growth parameter
$r = 2$. The gray dots show the exponential growth, increasing without limit;
the black dots show the logistic growth, which eventually slows down and
approaches an equilibrium at $n = 500$. The lines joining the dots are just to
guide the eye.
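
The comparison shown in Figure 12.33 can be reproduced with a few lines of Python. This sketch uses the parameters quoted in the text ($r = 2$, $n_0 = 4$, $N = 1000$); the function names and the choice of twenty years are illustrative assumptions.

```python
def exponential_step(n, r):
    """One year of exponential growth, equation (12.34)."""
    return r * n

def logistic_step(n, r, N):
    """One year of logistic growth, equation (12.40)."""
    return r * n * (1 - n / N)

def compare_growth(r, n0, N, years):
    """Return (exponential, logistic) population histories of equal length."""
    exp_vals, log_vals = [n0], [n0]
    for _ in range(years):
        exp_vals.append(exponential_step(exp_vals[-1], r))
        log_vals.append(logistic_step(log_vals[-1], r, N))
    return exp_vals, log_vals

if __name__ == "__main__":
    exp_vals, log_vals = compare_growth(r=2, n0=4, N=1000, years=20)
    for year, (e, l) in enumerate(zip(exp_vals, log_vals)):
        print(f"year {year:2d}: exponential = {e:10.0f}   logistic = {l:8.3f}")
```

The early rows of the two columns nearly coincide, after which the exponential column keeps doubling while the logistic column settles at 500, half the carrying capacity.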

Before we discuss the logistic map in any detail it is convenient to simplify it by
changing variables from the population $n$ to the relative population,

$$x = n/N, \tag{12.41}$$

the ratio of the actual population $n$ to its maximum possible value $N$. Dividing both
sides of (12.40) by $N$, we see that $x_t$ obeys the growth equation

$$x_{t+1} = f(x_t) = r x_t (1 - x_t) \tag{12.42}$$

where I have redefined the map $f$ as a function of $x$ to be

$$f(x) = rx(1 - x). \tag{12.43}$$

Since the population $n$ is confined to the range $0 \le n \le N$, the relative population
$x = n/N$ is restricted to

$$0 \le x \le 1. \tag{12.44}$$

Within this range, the function $x(1 - x)$ has a maximum of 1/4 (at $x = 1/2$). Thus, to
guarantee that $x_{t+1}$, as given by (12.42), never exceeds 1, we must limit the growth
factor to $0 \le r \le 4$. Therefore, we shall be studying the map (12.42) in the ranges
$0 \le x \le 1$ and $0 \le r \le 4$.

Before we look at some of the exotic aspects of the logistic map, let us first look at
a couple of cases where it behaves just as one might expect. Figure 12.34(a) shows the
logistic population for a growth parameter $r = 0.8$ and for two different initial values,
$x_0 = 0.1$ and $x_0 = 0.5$. You can see that in either case $x_t \to 0$ as $t \to \infty$. In fact it is
easy to see that as long as $r < 1$, the population eventually goes to zero whatever its
initial value: From (12.42) we see that $x_t < r x_{t-1}$ and hence that $x_t < r^t x_0$; therefore,
if $r < 1$, we conclude that $x_t \to 0$ as $t \to \infty$.

Figure 12.34(b) shows the logistic population for a growth parameter $r = 1.5$ and
for the same two initial conditions. For $x_0 = 0.1$ we see that the population increases at
first. On the other hand, for the larger initial value $x_0 = 0.5$ the mortality factor causes
the population to decrease at first. In either case, it eventually levels out at $x = 0.33$.

Figure 12.34 The relative population $x_t = n_t/N$ for the logistic map (12.42),
with two different initial conditions for each of two different growth rates.
(a) With the growth parameter $r = 0.8$, the population rapidly approaches zero
whether $x_0 = 0.1$ or $x_0 = 0.5$. (b) With $r = 1.5$ and the same two initial
conditions, the population approaches the fixed value 0.33.
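
The two cases of Figure 12.34 are easy to reproduce by iterating the logistic map in the relative variable $x$. The sketch below uses the growth parameters and initial values quoted in the text; the function names and the number of iterations are illustrative assumptions.

```python
def logistic(x, r):
    """The logistic map in the relative population x: f(x) = r*x*(1 - x)."""
    return r * x * (1 - x)

def orbit(x0, r, steps):
    """Return [x_0, x_1, ..., x_steps] for the logistic map with parameter r."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic(xs[-1], r))
    return xs

if __name__ == "__main__":
    for r in (0.8, 1.5):
        for x0 in (0.1, 0.5):
            xs = orbit(x0, r, 60)
            print(f"r = {r}, x0 = {x0}: x1 = {xs[1]:.4f}, x60 = {xs[-1]:.6f}")
```

For $r = 0.8$ both orbits shrink toward zero, staying below the bound $r^t x_0$ derived in the text. For $r = 1.5$ the orbit starting at 0.1 rises at first, the one starting at 0.5 falls at first, and both settle at $1/3$.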


Fixed Points 

In both the cases shown in Figure 12.34 we can say that the logistic map has a constant
attractor towards which the population eventually moves, namely $x = 0$ for any
$r < 1$, and $x = 0.33$ for $r = 1.5$. If the population happens to start out equal to such a
constant attractor, $x_0 = x^*$ say, then it simply remains fixed there for all time; that is,
$x_t = x^*$ for all $t$. This obviously happens if and only if

$$f(x^*) = x^*. \tag{12.45}$$

Any value $x^*$ which satisfies this equation is called a fixed point of the map $f$. These
fixed points are analogous to the equilibrium points of a mechanical system, in that a
system which starts at a fixed point remains there for ever.

Figure 12.35 Graphs of x against x (the 45° line) and of the logistic
function f(x) = r x (1 - x) (the two curves) for two choices of r, one
less than and one greater than 1. The fixed points of the logistic map
lie at the intersections of the line with the curve. When r < 1 there is
just one intersection, at x* = 0; when r > 1 there are two intersections,
one still at x* = 0 and the other at an x* > 0.


For a given map, we can solve the equation (12.45) to find the map’s fixed points.
For example, the fixed points of the logistic map must satisfy

r x* (1 - x*) = x*, (12.46)

which is easily solved to give

x* = 0 or x* = (r - 1)/r. (12.47)

The first solution is the fixed point x* = 0 that we have already noted. The second 
solution depends on the value of r. For r < 1 it is negative and hence irrelevant. For 
r = 1 it coincides with the first solution x* = 0, but for r > 1 it is a distinct, second 
fixed point. For example, for r = 1.5 it gives the fixed point we have already noted at 
x* = 1/3. 
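
The result (12.47) can be checked numerically, and a numerical search is the natural
fallback for a map whose fixed-point equation has no closed-form solution, such as
the sine map f(x) = r sin(πx). Below is a hedged sketch using plain bisection on
h(x) = f(x) - x. The helper name nonzero_fixed_point, the bracketing interval, and the
tolerance are all assumptions made for this illustration, and the bracket must
contain exactly one sign change of h.

```python
import math


def nonzero_fixed_point(f, lo=1e-3, hi=1.0, tol=1e-12):
    """Find x* in (lo, hi) with f(x*) = x* by bisection on h(x) = f(x) - x.

    Assumes h(lo) > 0 > h(hi), which holds for a single concave-down arch
    that rises above the 45-degree line just to the right of x = 0.
    """
    def h(x):
        return f(x) - x

    if not (h(lo) > 0 > h(hi)):
        raise ValueError("no sign change in bracket: no nonzero fixed point found")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid      # still above the 45-degree line: root lies to the right
        else:
            hi = mid      # on or below the line: root lies to the left
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    r = 1.5
    x_star = nonzero_fixed_point(lambda x: r * x * (1 - x))
    print(x_star, (r - 1) / r)   # numerical and analytic values, both near 1/3

    # A map with no closed-form fixed point in general: f(x) = r sin(pi x).
    print(nonzero_fixed_point(lambda x: 0.5 * math.sin(math.pi * x)))
```

For the logistic map with r < 1 the bracket contains no sign change and the sketch
raises an error, which matches the statement that the second solution is then
irrelevant.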

It is a fortunate circumstance that we can actually solve the equation (12.45)
analytically to find the fixed points of the logistic map, but it is also instructive
to examine the equation graphically, since graphical considerations give additional
insight and can be applied to many different maps, some of which cannot necessarily
be solved analytically. To solve Equation (12.45) graphically, we just plot the two
functions x and f(x) against x as in Figure 12.35 and read off the fixed points as
those values x* where the two graphs intersect. When r is small, the curve of f(x)
lies below the 45° line of x against x, and the only intersection is at x = 0; that is, the
only fixed point is x* = 0. When r is large the curve of f(x) bulges above the 45° line
and there are two fixed points. The boundary between the two cases is easily found
by noting that the slope of f(x) at x = 0 is just r. Thus as we increase r, the curve
moves across the 45° line (whose slope is 1) when r = 1. Thus for r < 1 there is just
one fixed point at x* = 0; but when r > 1 there are two fixed points, one at x* = 0
and the other at an x* > 0. The advantage of this graphical argument is that it works
equally well for any similar function f(x) as long as it is a single concave-down arch.
[For example, f(x) could be the function f(x) = r sin(πx) of Problem 12.23.]


A Test for Stability

That x* is a fixed point (that is, an equilibrium value) for the logistic map guarantees 
that when the population starts out at x* it will stay there. By itself this is not enough 
to ensure that the value x* is an attractor for the map. We need to check in addition 
that x* is a stable fixed point, that is, that if the population starts out close to x* 
it will evolve towards x* not away from it. (This issue has an exact parallel in the 
study of equilibrium points of a mechanical system: If a system starts out exactly at 
an equilibrium point then it will — in principle — stay there indefinitely. But only if 
the equilibrium is stable will the system move back to equilibrium if disturbed a little 
away.) 

There is a simple test for stability, which we can derive as follows: If x_t is close to
a fixed point x*, we naturally write

x_t = x* + ε_t. (12.48)

That is, we define ε_t as the distance of x_t from the fixed point x*. If ε_t is small, this
lets us evaluate x_{t+1} as

x_{t+1} = f(x_t) = f(x* + ε_t)
        ≈ f(x*) + f'(x*) ε_t = x* + λ ε_t (12.49)

where in the last expression I have used the fact that x* is a fixed point [so that
f(x*) = x*] and I have introduced the notation λ for the derivative of f(x) at x*,

λ = f'(x*). (12.50)

Now, according to (12.48), x_{t+1} = x* + ε_{t+1}. Comparing this with the last expression
of (12.49) we see that

ε_{t+1} ≈ λ ε_t. (12.51)

Because of this simple relation, the number λ = f'(x*) is called the multiplier or
eigenvalue of the fixed point. It shows that if |λ| < 1, then once x_t is close to x*,
successive values get closer and closer to x*. On the other hand, if |λ| > 1, then when
x_t is close to x*, the succeeding values move away from x*. This is our required test
for stability:


Stability of Fixed Points 

Let x* be a fixed point of the map x_{t+1} = f(x_t); that is, f(x*) = x*. If |f'(x*)| <
1, then x* is stable and acts as an attractor. If |f'(x*)| > 1, then x* is unstable
and acts as a repeller.
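
The linearized relation (12.51) behind this test can be checked numerically. The
sketch below starts a logistic population a small distance from the nonzero fixed
point and prints the ratios ε_{t+1}/ε_t, which should stay close to the multiplier
λ = f'(x*). The function name deviation_ratios, the value r = 1.5, and the starting
offset 10^-4 are assumptions chosen for illustration only.

```python
def logistic(x, r):
    """Logistic map f(x) = r x (1 - x)."""
    return r * x * (1.0 - x)


def deviation_ratios(r, x_star, eps0, n_steps):
    """Return the ratios eps_{t+1} / eps_t for an orbit started at x* + eps0."""
    x = x_star + eps0
    ratios = []
    for _ in range(n_steps):
        x_next = logistic(x, r)
        ratios.append((x_next - x_star) / (x - x_star))
        x = x_next
    return ratios


if __name__ == "__main__":
    r = 1.5
    x_star = (r - 1) / r                 # nonzero fixed point, from (12.47)
    multiplier = r * (1 - 2 * x_star)    # derivative of r x (1 - x) at x*
    print("multiplier f'(x*):", multiplier)
    print("observed ratios  :", deviation_ratios(r, x_star, 1e-4, 5))
```

The small discrepancy between each ratio and the multiplier is the second-order term
dropped in (12.49); it shrinks as the deviation itself shrinks.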


We can immediately apply this test to the two fixed points of the logistic map:
Since f(x) = r x (1 - x), its derivative is

f'(x) = r (1 - 2x).

At the fixed point x* = 0, this means the crucial derivative is

f'(x*) = r;

therefore, the fixed point x* = 0 is stable for r < 1 but unstable for r > 1. At the fixed
point x* = (r - 1)/r, the derivative is

f'(x*) = 2 - r,

so that this fixed point is stable for 1 < r < 3, but unstable for r > 3. These results
are summarised in Figure 12.36, which shows the fixed-point values x* as functions
of the growth parameter r with the stable fixed points shown as solid curves and the
unstable as dashed curves.

Figure 12.36 The fixed points x* of the logistic map as functions of
the growth parameter r. The solid curves show the stable fixed points
and the dashed curves the unstable. Note how the fixed point x* = 0
becomes unstable at precisely the place (r = 1) where the nonzero fixed
point first appears.
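
The stability results for the two fixed points can be tabulated with a few lines of
code. This is only an illustrative sketch: the function names fixed_points and
classify are hypothetical, and the label "marginal" for the borderline case
|f'(x*)| = 1, where the linear test says nothing, is my own wording rather than a
term from the text.

```python
def fixed_points(r):
    """Fixed points of the logistic map from (12.47); the second is kept only if r > 1."""
    points = [0.0]
    if r > 1:
        points.append((r - 1) / r)
    return points


def classify(r, x_star):
    """Apply the stability test |f'(x*)| < 1, using f'(x) = r (1 - 2x)."""
    multiplier = r * (1 - 2 * x_star)
    if abs(multiplier) < 1:
        return "stable"
    if abs(multiplier) > 1:
        return "unstable"
    return "marginal"   # |f'(x*)| = 1: the linear test is inconclusive


if __name__ == "__main__":
    for r in (0.8, 1.5, 2.8, 3.2):
        print(r, [(round(x, 4), classify(r, x)) for x in fixed_points(r)])
```

For r = 3.2 both fixed points come out unstable, which is the situation taken up
under the first period doubling.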

The arguments just given show exactly when each of the two fixed points becomes 
unstable and that the second one appears exactly when the first becomes unstable. On 
the other hand, it would be nice to have an argument that made clearer why the fixed 
points behave the way they do and showed more clearly (what is true) that the same 
qualitative conclusions apply to any other one-dimensional map with the same general 
features. Such an argument can be found by examining the graph of Figure 12.35. In 
that figure, we saw that the fixed points of the map correspond to intersections of the 
curve of f(x) against x with the 45° line (x against x). When r is small, it was clear 
that there is only one such intersection at x* = 0. As r increased the curve bulged up 
more and eventually crossed the 45° line producing a second intersection and hence 
a second, nonzero fixed point. We saw that this second intersection appears when 
the slope f'(0) of the curve at x = 0 is exactly 1 (that is, when it is tangent to the
45° line). In light of our test for stability, this means the second fixed point has to 
appear at exactly the moment when the first one at x* = 0 changes from stable to 
unstable. 


The First Period Doubling

Figure 12.36 tells the whole story of the fixed points of the logistic map. The solid
curves show the stable fixed points, which are the constant attractors. We see that for
r < 1, the logistic population approaches 0 as t → ∞; for 1 < r < 3, it approaches the
other fixed point x* = (r - 1)/r as t → ∞. But what happens (and this proves the
most interesting question) when r > 3? In the computer age this is easily answered.
Figure 12.37 shows the first 30 cycles of the logistic population with a growth
parameter of r = 3.2. The striking feature of this graph is that it no longer settles
down to a single constant value. Instead, it bounces back and forth between the two
fixed values shown as x_a and x_b, repeating itself once every two cycles. In the language
developed for the driven damped pendulum, we can say that the period has doubled
to period two, and this period-two limiting motion is called a two-cycle.



Figure 12.37 A logistic population with growth parameter r = 3.2.
The population never settles down to a constant value; rather, it
oscillates between two values, repeating itself once every two cycles. In
other words, it has doubled its period to period two.

We can understand the doubling of the period of the logistic map with the graphical
methods already developed, although the argument is a bit more complicated. The
essential observations are these: First, neither of the two limiting values x_a and x_b is
a fixed point of the map f(x). Instead,

f(x_a) = x_b and f(x_b) = x_a. (12.52)

Let us, however, consider the double map (or second iterate map)

g(x) = f(f(x)), (12.53)

which carries the population x_t onto the population two years hence,

x_{t+2} = g(x_t). (12.54)

It is clear from Figure 12.37 or Equations (12.52) that both x_a and x_b are fixed points
of the double map g(x), 24

g(x_a) = x_a and g(x_b) = x_b. (12.55)


24 These points are also called second-order fixed points.

Figure 12.38 The logistic map f(x) and its second iterate g(x) = f(f(x)).
(a) For a growth parameter r = 0.8, each function is a single arch which
intersects the 45° line just once, at the origin. (b) When r = 2.6, f(x)
is higher than before, but still just a single arch; g(x) has developed two
maxima with a valley in between. Both functions have acquired a second
intersection with the 45° line, at the same point marked x*.


Thus to study the two-cycles of the map f(x) we have only to examine the fixed
points of the double map g(x) = f(f(x)), and for this we can use our understanding
of fixed points. Before we do this, it is worth noticing that any fixed point of f(x) is
automatically also a fixed point of g(x). (If x_{t+1} = x_t for all t, then certainly x_{t+2} = x_t.)
Therefore, the two-cycles of f(x) correspond to those fixed points of g(x) that are
not also fixed points of f(x).

Since f(x) is a quadratic function of x, it follows that g(x) = f(f(x)) is a quartic
function, whose explicit form can be written down and studied. However, we can gain
a better understanding by considering its graph. When r is small, we know that f(x) is
a single low arch (as in Figure 12.35) and you can easily convince yourself that g(x) is
an even lower arch, as sketched in Figure 12.38(a), which shows both functions for a
growth parameter r = 0.8. As r increases, both arches rise, and by the time r = 2.6 the
function g(x) has developed two maxima as seen in Figure 12.38(b). (You can explore
the reason for this development in Problem 12.26.) Also, both curves now intersect
the 45° line twice, once at the origin and once at the fixed point x* = (r - 1)/r. That
both curves intersect the 45° line at the same points shows two things: First, as we
already knew, every fixed point of f(x) is also a fixed point of g(x), and, second, (for
the growth parameters shown in this figure) every fixed point of g(x) is also a fixed
point of f(x); that is, there are no two-cycles yet.
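
For readers who want the quartic in front of them, its explicit form follows by
substituting f(x) = r x (1 - x) into itself. The expansion and the factorization of
g(x) - x below are a worked sketch that is not part of the original argument; they can
be verified by multiplying out. The first two factors of g(x) - x vanish at the fixed
points of f(x), and the remaining quadratic factor has real roots only when
(r + 1)(r - 3) ≥ 0, that is, only for r ≥ 3.

```latex
\begin{align*}
g(x) &= f\bigl(f(x)\bigr) = r\,f(x)\,\bigl[1 - f(x)\bigr] \\
     &= r^2 x(1-x)\,\bigl[1 - r x(1-x)\bigr]
      = r^2 x(1-x)\,\bigl(1 - r x + r x^2\bigr), \\[4pt]
g(x) - x &= -x\,\bigl(r x - r + 1\bigr)\,\bigl(r^2 x^2 - r(r+1)\,x + r + 1\bigr).
\end{align*}
```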

As we increase the growth parameter r still further, the two crests of the double map
g(x) continue to rise, while the valley between them gets lower. (Again see Problem
12.26 for the reason.) Figure 12.39 shows the curves of f(x) and g(x) for growth
parameters r = 2.8, 3.0, and 3.4. With r = 2.8 [part (a)], the double map g(x) still
has just the same two fixed points as f(x), at x = 0 and at the point indicated as x*. By
the time r = 3.4 [part (c)], the double map has developed two additional fixed points,
shown as x_a and x_b; that is, the logistic map now has a two-cycle. The threshold value
at which the two-cycle appears is clearly the value for which the curve g(x) is tangent
to the 45° line, namely r = 3 for the logistic map, as in part (b) of the figure. If
r < 3, the curve g(x) crosses the 45° line just once, at x*; if r > 3 it crosses three
times, once at x* and two more times, at x_a and x_b (one above and one below x*).

Figure 12.39 The logistic map f(x) (solid curves) and its second iterate g(x) =
f(f(x)) (dashed) for r = 2.8, 3.0, and 3.4. (a) With r = 2.8, the map g(x) has
the same two fixed points as f(x), namely x = 0 and x = x* as shown. (b) When
r reaches 3.0, the curve of g(x) is tangent to the 45° line, and for any larger value
of r, as in (c), the map g(x) has two extra fixed points labelled x_a and x_b.

We already know that r = 3 is the threshold at which the fixed point x* becomes
unstable. Thus, Figure 12.39 shows that the two-cycle appears at the moment when
the “one-cycle” (that is, the fixed point) becomes unstable. Happily, we are now in a
position to see why this has to be. We have already noted that the two-cycle appears
precisely when the curve g(x) is tangent to the 45° line at the point x*, that is, when

g'(x*) = 1. (12.56)

To see what this implies, let us evaluate the derivative g'(x) of the double map g(x)
at either of the two-cycle fixed points, x_a say,

g'(x_a) = d/dx g(x) |_{x_a} = d/dx f(f(x)) |_{x_a} = f'(f(x)) · f'(x) |_{x_a}
        = f'(x_b) f'(x_a). (12.57)

Here in the last expression of the first line I have used the chain rule, and in the second
line I have used the fact that f(x_a) = x_b. Let us apply this result to the birth of the
two-cycle in Figure 12.39(b). At the moment of birth, x* = x_a = x_b, and we can combine
(12.56) and (12.57) to give

[f'(x*)]^2 = 1.

This means that |f'(x*)| = 1, and, by our test for stability, we see that the moment
when the two-cycle is born is precisely the moment when the fixed point x* becomes
unstable.
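
The relations (12.52), (12.55), and (12.57) can be confirmed numerically for a
particular growth parameter. The sketch below takes r = 3.2 (the value used in
Figure 12.37), lets the transients die out, and then checks that the two limiting
values swap under f, are fixed under g, and that g'(x_a) equals the product
f'(x_b) f'(x_a). The function names, the starting value x_0 = 0.5, and the number of
transient steps are assumptions made for illustration.

```python
def f(x, r):
    """Logistic map."""
    return r * x * (1.0 - x)


def g(x, r):
    """Second iterate (double map) g(x) = f(f(x))."""
    return f(f(x, r), r)


def f_prime(x, r):
    """Derivative of the logistic map, f'(x) = r (1 - 2x)."""
    return r * (1.0 - 2.0 * x)


def two_cycle(r, x0=0.5, n_transient=2000):
    """Estimate the two-cycle (x_a, x_b) by iterating until transients die out."""
    x = x0
    for _ in range(n_transient):
        x = f(x, r)
    return x, f(x, r)


if __name__ == "__main__":
    r = 3.2
    x_a, x_b = two_cycle(r)
    print("x_a, x_b     =", x_a, x_b)
    print("f(x_b) - x_a =", f(x_b, r) - x_a)   # near zero: the two values swap
    print("g(x_a) - x_a =", g(x_a, r) - x_a)   # near zero: fixed point of g
    print("f(x_a) - x_a =", f(x_a, r) - x_a)   # not small: not a fixed point of f
    print("g'(x_a)      =", f_prime(x_b, r) * f_prime(x_a, r))  # chain rule (12.57)
```

Because the printed value of g'(x_a) has magnitude less than 1, the stability test
applied to the double map indicates that this two-cycle is stable at r = 3.2.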

We can use these same techniques to explore what happens as we increase r still
further. For example, one can show (Problem 12.28) that the two-cycle that we have
just seen appearing at r = 3 becomes unstable at r = 1 + √6 = 3.449 and is succeeded
by a stable four-cycle. However, to keep this long chapter from growing totally out of
bounds, I shall just sketch briefly some highlights that can be found by exploring the
logistic map numerically.

Figure 12.40 The fixed-points and two-cycles of the logistic map as
functions of the growth parameter r. The solid curves are stable and
the dashed curves unstable.


Bifurcation Diagrams 

In Figure 12.36 we saw the complete history of the fixed points of the logistic map
itself. If we add onto that picture the fixed points x_a and x_b of the double map
g(x) = f(f(x)) (that is, the two-cycles of the logistic map) we get the graphs shown
in Figure 12.40. These curves are reminiscent of the beginnings of the bifurcation
diagram, Figure 12.17, for the driven damped pendulum. In fact, we can redraw
Figure 12.40 using the same procedure as was used for Figure 12.17: First one picks a
large number of equally spaced values of the growth parameter in the range of interest.
(I chose the range 2.8 ≤ r ≤ 4, since this is where the excitement lies, and chose
1200 equally spaced values in this range.) Then for each value of r one calculates
the populations x_0, x_1, x_2, ..., x_{t_max}, where t_max is some very large time. (I chose
t_max = 1000.) Next one chooses a time t_min large enough to let all transients die out.
(I chose t_min = 900.) Finally, in a plot of x against r, one shows the values of x_t for
t_min ≤ t ≤ t_max as dots above the corresponding value of r. The resulting bifurcation
diagram for the logistic map is shown in Figure 12.41.
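
The procedure just described translates almost line for line into code. The sketch
below uses the stated values (1200 equally spaced r values in 2.8 ≤ r ≤ 4,
t_max = 1000, t_min = 900); the initial value x_0 = 0.3 is an assumption, since the
text does not say what starting population was used, and the function name
bifurcation_points is hypothetical. The plotting lines are left commented out so that
the sketch runs without any plotting library.

```python
def bifurcation_points(r_min=2.8, r_max=4.0, n_r=1200,
                       t_min=900, t_max=1000, x0=0.3):
    """Return (r, x_t) pairs with t_min <= t <= t_max for n_r equally spaced r values.

    Assumes n_r >= 2. The starting value x0 is an arbitrary choice.
    """
    points = []
    for i in range(n_r):
        r = r_min + (r_max - r_min) * i / (n_r - 1)
        x = x0
        for t in range(1, t_max + 1):
            x = r * x * (1.0 - x)      # x now holds x_t
            if t >= t_min:             # keep only values after the transients
                points.append((r, x))
    return points


if __name__ == "__main__":
    pts = bifurcation_points()
    print(len(pts), "points")
    # Optional plot, if matplotlib is available:
    # import matplotlib.pyplot as plt
    # plt.plot([p[0] for p in pts], [p[1] for p in pts], ",k")
    # plt.xlabel("r"); plt.ylabel("x"); plt.show()
```

Each value of r contributes 101 dots; where the attractor has period one they all
fall on top of each other, and where it has period two they fall on just two values.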

The similarity of the bifurcation diagram of Figure 12.41 for the logistic map and
Figure 12.17 for the driven pendulum is striking indeed. The interpretation of the two
pictures is also similar. On the left of Figure 12.41 we see the period-one attractor
for r < 3, followed by the first period doubling at r = 3. This is followed by the
second doubling at r = 3.449, and a whole cascade of doublings that end in chaos
near r = 3.570. With the help of this diagram we can predict the long-term behavior
of the logistic population for any particular choice of r (though there is clearly much
fine detail that cannot be distinguished at the scale of Figure 12.41). For example at
r = 3.5, it is clear that the population should have period four, a claim that is borne
out in Figure 12.42(a), which shows the twenty cycles from t = 100 to 120 for this
value of r. Similarly, around r = 3.84, sandwiched between wide intervals of chaos,
you can see a narrow window that appears to have period three, an observation borne
out in Figure 12.42(b). As an example of chaos, Figure 12.43 shows the evolution
of a population with r = 3.7; in the eighty cycles shown, there is no evidence of any
repetition.

Figure 12.41 Bifurcation diagram for the logistic map. A period-doubling
cascade is clearly visible starting at r_1 = 3, with a second doubling at r_2 = 3.449, and
ending in chaos at r_c = 3.570. Several windows of periodicity stand out clearly
amongst the chaos, especially the period-three window near r = 3.84. The tiny
rectangle near r = 3.84 is the region that is enlarged in Figure 12.44.

Figure 12.42 Long-term evolution of logistic populations with growth
parameters r = 3.5 and 3.84. (a) Period four. With r = 3.5, the twenty cycles 100 ≤
t ≤ 120 take on just four distinct values at intervals of four cycles. (The dashed
horizontal line is just to highlight the constancy of every fourth dot.) (b) Period
three. With r = 3.84, the population repeats every three cycles.
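
The periods read off the bifurcation diagram can be checked directly by iterating
the map and looking for the smallest repeat interval. The following is a rough
sketch, not a rigorous test for periodicity: the function name attractor_period, the
starting value, the number of transient steps, the tolerance, and the cutoff
max_period are all assumptions, and a result of None only means that no period up to
the cutoff was detected.

```python
def attractor_period(r, x0=0.5, n_transient=10000, max_period=64, tol=1e-9):
    """Return the smallest p <= max_period with |x_{t+p} - x_t| < tol after the
    transients, or None if no such p is found (for example, in a chaotic regime)."""
    x = x0
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    # Record enough further values to compare x_t with x_{t+p} for every p tested.
    orbit = [x]
    for _ in range(2 * max_period):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    for p in range(1, max_period + 1):
        if all(abs(orbit[t + p] - orbit[t]) < tol for t in range(max_period)):
            return p
    return None


if __name__ == "__main__":
    for r in (2.8, 3.2, 3.5, 3.84, 3.7):
        print(f"r = {r}: detected period = {attractor_period(r)}")
```

The values r = 3.2, 3.5, and 3.84 should report periods two, four, and three, in
agreement with Figures 12.37 and 12.42.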

Figure 12.43 Chaos. With r = 3.7, the eighty cycles, 100 ≤ t ≤ 180, of the
logistic map show no tendency to repeat themselves.

From Figure 12.41 (and careful enlargements), one can read off the threshold
values of r at which the period doublings occur. If we denote by r_n the threshold
at which the cycle of period 2^n appears, these are found to be


As you can check, the separation between successive thresholds shrinks geometrically,
just like the corresponding intervals for the period doubling of the driven
pendulum. In fact, (see Problem 12.29) the numbers (12.58) give a remarkable fit
to the Feigenbaum relation (12.17) that we first met in connection with the DDP, with
the same Feigenbaum constant (12.18). Another striking parallel with the driven
pendulum is this (Problem 12.30): For those r for which the evolution is non-chaotic, if
two populations start out sufficiently close to one another, their difference will
converge exponentially to zero as t → ∞. For those r for which the evolution is chaotic,
the same difference diverges exponentially as t → ∞. That is, the chaotic evolution
of the logistic map shows the same sensitive dependence on initial conditions that we
found for the driven pendulum.
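
This contrast between non-chaotic and chaotic growth parameters is simple to see
numerically. The sketch below follows two populations that start a tiny distance
apart and records their separation at each step; the function name separation, the
starting value 0.3, and the initial offsets are assumptions chosen for illustration.
Because x is confined to the range 0 ≤ x ≤ 1, the exponential growth in the chaotic
case can only continue until the separation is comparable with the size of the
attractor.

```python
def separation(r, x0, delta0, n_steps):
    """Return |y_t - x_t| for t = 0..n_steps, where y_0 = x_0 + delta0."""
    x, y = x0, x0 + delta0
    seps = [abs(y - x)]
    for _ in range(n_steps):
        x = r * x * (1.0 - x)
        y = r * y * (1.0 - y)
        seps.append(abs(y - x))
    return seps


if __name__ == "__main__":
    regular = separation(2.8, 0.3, 1e-6, 100)   # period-one regime
    chaotic = separation(3.7, 0.3, 1e-9, 200)   # chaotic regime
    print("r = 2.8: initial and final separation:", regular[0], regular[-1])
    print("r = 3.7: initial and largest separation:", chaotic[0], max(chaotic))
```

A plot of the logarithm of these separations against t would show a roughly
straight line sloping down for r = 2.8 and sloping up (until it saturates) for
r = 3.7.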

Perhaps the most striking feature of the logistic bifurcation diagram is that, when
one zooms in on certain parts of the diagram, a perfect self-similarity emerges. The
small rectangle near r = 3.84 in Figure 12.41 has been enlarged many fold in Figure
12.44. Apart from the facts that this new picture is upside down and its scale is vastly
different, it is a perfect copy of the whole original diagram of which it is a part. This is
a striking example of the self-similarity which appears in many places in the study of
chaos and which we met in connection with the Poincaré section of the DDP shown
in Figures 12.29, 12.30, and 12.31.

There are many other features of the logistic map and more parallels with the DDP, 
all worth exploring and some treated in the problems at the end of this chapter. Here, 
however, I shall leave the logistic map and close this chapter with the hope that you 
feel at home with some of the main features of chaos and the tools used to explore 
them. I hope too that your appetite has been whetted to explore this fascinating subject 
further. 25 


25 For a comprehensive history, with very little mathematics, see Chaos: Making a New Science
by James Gleick, Viking-Penguin, New York (1987). For a quite mathematical, but highly readable,
account of chaos in many different fields see Nonlinear Dynamics and Chaos by Steven H. Strogatz,
Addison-Wesley, Reading, MA (1994). Two books which focus mostly on chaos in physical systems
are Chaotic Dynamics: An Introduction by G. L. Baker and J. P. Gollub, Cambridge University Press,
Cambridge (1996) and Chaos and Nonlinear Dynamics by Robert C. Hilborn, Oxford University
Press, New York (2000).

Figure 12.44 A many-fold enlargement of the small rectangle in the logistic
bifurcation diagram of Figure 12.41. This tiny section of the original diagram
is a perfect, upside-down copy of the whole original. Note that this section is
just one of three strands in the original; thus, although this diagram starts out
looking just like period 1 doubling to period 2, it is actually period 3 doubling
to period 6 and so on.


Principal Definitions and Equations of Chapter 12

The Driven Damped Pendulum

A damped pendulum that is driven by a sinusoidal force F(t) = F_0 cos(ωt) satisfies
the nonlinear equation

φ̈ + 2β φ̇ + ω_0^2 sin φ = γ ω_0^2 cos ωt [Eq. (12.11)]

where γ = F_0/mg is called the drive strength and is the ratio of the drive amplitude
to the weight.


Period Doubling 

For small drive strengths (γ ≲ 1), the long-term response, or attractor, of the
pendulum has the same period as the drive force. But if γ is increased past γ_1 = 1.0663,
for certain initial conditions and drive frequencies, the attractor undergoes a
period-doubling cascade, in which the period repeatedly doubles, approaching infinity as
γ → γ_c = 1.0829. [Section 12.4]


Chaos 

If the drive strength is increased beyond γ_c, at least for certain choices of drive
frequency and initial conditions, the long-term motion becomes nonperiodic, and we
say that chaos has set in. As γ is increased still further, the long-term motion varies,
sometimes chaotic, sometimes periodic. [Sections 12.5 & 12.6]


Sensitivity to Initial Conditions

Chaotic motion is extremely sensitive to initial conditions. If two identical chaotic
pendulums with identical drive forces are launched with slightly different initial
conditions, their separation increases exponentially with time, however small the
initial difference. [Section 12.5]


Bifurcation Diagrams 

A bifurcation diagram is a plot of the system’s position at discrete times, t_0, t_0 + 1,
t_0 + 2, ... (more generally t_0, t_0 + τ, t_0 + 2τ, ...) as a function of the drive strength
(more generally the appropriate control parameter). [Figures 12.17 & 12.18]


State-Space Orbits and Poincaré Sections

The state space for a system with n degrees of freedom is the 2n-dimensional space
comprising the n generalized coordinates and the n generalized velocities. For the
DDP, the points in state space have the form (φ, φ̇). A state-space orbit is just the
path traced in state space by a system as t evolves. A Poincaré section is a state-space
orbit restricted to discrete times t_0, t_0 + 1, t_0 + 2, ... (and, when n > 2, to a subspace
of fewer dimensions). [Sections 12.7 & 12.8]

The Logistic Map 

The logistic map is a function (or “map”) that gives a number x t at regular discrete 
intervals (for example, the relative population of a certain bug once each year) as 

x_{t+1} = r x_t (1 - x_t). [Eq. (12.42)]

Although this is not a mechanical system, it exhibits many of the features (period 
doubling, chaos, sensitivity to initial conditions) of nonlinear mechanical systems. 

[Section 12.9] 


Problems for Chapter 12

Stars indicate the approximate level of difficulty, from easiest (★) to most difficult (★★★).

Warning: Even when the motion is nonchaotic, it can be very sensitive to tiny errors. In several of the 
computer problems you may need to increase your working precision to get satisfactory results. 

section 12.1 Linearity and Nonlinearity 

12.1 ★ Consider the nonlinear first-order equation ẋ = 2√(x - 1). (a) By separating variables, find a
solution x_1(t). (b) Your solution should contain one constant of integration k, so you might reasonably
expect it to be the general solution. Show, however, that there is another solution, x_2(t) = 1, that is
not of the form of x_1(t) whatever the value of k. (c) Show that although x_1(t) and x_2(t) are solutions,
neither A x_1(t), nor B x_2(t), nor x_1(t) + x_2(t) are solutions. (That is, the superposition principle does
not apply to this equation.)

12.2 ★ Here is a different example of the disagreeable things that can happen with nonlinear equations.
Consider the nonlinear equation ẋ = 2√x. Since this is first-order, one would expect that specification
of x(0) would determine a unique solution. Show that for this equation there are two different solutions,
both satisfying the initial condition x(0) = 0. [Hint: Find one solution x_1(t) by separating variables,
but note that x_2(t) = 0 is another. Fortunately none of the equations normally encountered in classical
mechanics suffer from this disagreeable ambiguity.]

12.3 ★ Consider a second-order linear homogeneous equation of the form (12.6). (a) Write out a
detailed proof of the superposition principle, that if x_1(t) and x_2(t) are solutions of this equation, then
so is any linear combination a_1 x_1(t) + a_2 x_2(t), where a_1 and a_2 are any two constants. (b) Consider
now the nonlinear equation in which the third term of (12.6) is replaced by r(t)√x(t). Explain clearly
why the superposition principle does not hold for this equation.

12.4 ★ Consider an inhomogeneous second-order linear equation of the form

p(t)ẍ(t) + q(t)ẋ(t) + r(t)x(t) = f(t). (12.59)

Let x_p(t) denote a solution (a “particular” solution) of this equation and prove that any solution x(t)
can be written as

x(t) = x_p(t) + a_1 x_1(t) + a_2 x_2(t) (12.60)

where x_1(t) and x_2(t) are two independent solutions of the corresponding homogeneous equation,
that is, (12.59) with f(t) deleted. [Hint: Write down the equations for x(t) and x_p(t) and subtract.]
This result shows that to find all solutions of (12.59), we have only to find one particular solution and
two independent solutions of the corresponding homogeneous equation. (b) Explain clearly why the
result you proved in part (a) is not, in general, true for a nonlinear equation such as

p(t)ẍ(t) + q(t)ẋ(t) + r(t)√x(t) = f(t).


section 12.3 Some Expected Features of the DDP 

12.5 ★ Use Euler’s relation and the corresponding expression for cos θ (inside the front cover) to prove
the identity (12.15).

12.6 ★★ [Computer] (a) Use appropriate software to solve the equation (12.11) numerically, for a DDP
with the following parameters: drive strength γ = 0.9, drive frequency ω = 2π, natural frequency
ω_0 = 1.5ω, damping constant β = ω_0/4, and initial conditions φ(0) = φ̇(0) = 0. Solve the equation
and plot your solution for six cycles, 0 ≤ t ≤ 6, and verify that you get the result shown in Figure 12.3.
(b) and (c) Solve the same equation twice more with the two different initial conditions φ(0) = ±π/2
(both with φ̇(0) = 0) and plot all three solutions on the same picture. Do your results bear out the claim
that, for this drive strength, all solutions (whatever their initial conditions) approach the same periodic
attractor?

section 12.4 The DDP: Approach to Chaos

12.7 ★★ [Computer] Do all the same calculations as in Problem 12.6 but with a drive strength γ = 1.06
and for 0 ≤ t ≤ 10. In part (a) verify that your results agree with Figure 12.4. Do your results suggest
that there is still a unique attractor to which all solutions (whatever their initial conditions) converge?
(At first sight the answer may appear to be “No,” but remember that values of φ that differ by 2π should
be considered to be the same.)

12.8 ★★ [Computer] Use a computer to find a numerical solution of the equation of motion (12.11)
for a DDP with the following parameters: drive strength γ = 1.073, drive frequency ω = 2π, natural
frequency ω_0 = 1.5ω, damping constant β = ω_0/4, and initial conditions φ(0) = π/2 and φ̇(0) = 0.
(a) Solve for 0 ≤ t ≤ 50, and then plot the first ten cycles, 0 ≤ t ≤ 10. (b) To be sure that the initial
transients have died out, plot the ten cycles 40 ≤ t ≤ 50. What is the period of the long-term motion
(the attractor)?

12.9 ★★ [Computer] Do the same calculations as in Problem 12.8 with all the same parameters except
that φ(0) = 0. Plot the first 30 cycles 0 ≤ t ≤ 30 and check that you agree with Figure 12.5. Plot the
ten cycles 40 ≤ t ≤ 50 and find the period of the long-term motion.

12.10 ★★ [Computer] Explore the behavior of the DDP with the same parameters as in Problem 12.8,
but with several different initial conditions. For example, you might keep φ̇(0) = 0 but try various
different values for φ(0) between -π and π. You will find that the initial behavior varies quite a lot
according to the initial conditions, but the long-term motion is the same in all cases (as long as you
remember that values of φ that differ by multiples of 2π represent the same position).

12.11 ★★ Test how well the values of the thresholds γ_n given in Table 12.1 fit the Feigenbaum relation
(12.17) as follows: (a) Assuming that the Feigenbaum relation is exactly true use it to prove that
(γ_{n+1} - γ_n) = (1/δ)^{n-1} (γ_2 - γ_1) and, hence, that a plot of ln(γ_{n+1} - γ_n) against n should be a straight
line with slope -ln δ. (b) Make this plot for the three differences of Table 12.1. How well do your points
seem to bear out our prediction? Fit a line to your plot (either graphically or using a least squares fit) and
find the slope and hence the Feigenbaum number δ. [You would not expect to get very good agreement
with the known value (12.18) for two reasons: You have only three points to plot and the Feigenbaum
relation (12.17) is only approximate, except in the limit of large n. Under the circumstances, you will
find the agreement is remarkable.]

12.12 ★★ Here is another way to look at the Feigenbaum relation (12.17): (a) Assuming that (12.17) is
exactly true, prove that the thresholds γ_n approach a finite limit γ_c and then prove that γ_n = γ_c - K/δ^n,
where K is a constant. This means that a plot of γ_n against δ^{-n} should be a straight line. (b) Using the
known value (12.18) of δ and the four values of γ_n in Table 12.1 make this plot. Does it seem to fit our
prediction? The vertical intercept of your graph should be γ_c. What is your value and how well does it
agree with (12.20)?

section 12.5 Chaos and Sensitivity to Initial Conditions 

12.13 ★ You can see in Figure 12.13 that for γ = 1.105, the separation of two identical pendulums with
slightly different initial conditions increases exponentially. Specifically Δφ starts out at 10^{-4} and by
t = 14.5 it has reached about 1. Use this to estimate the Liapunov exponent λ as defined in (12.26),
Δφ(t) ~ K e^{λt}. Your answer should confirm that λ > 0 for chaotic motion.

12.14 ★★ [Computer] Numerically solve the equation of motion (12.11) for a DDP with drive strength
γ = 1.084, and the following other parameters: drive frequency ω = 2π, natural frequency ω_0 = 1.5ω,
damping constant β = ω_0/4, and initial conditions φ(0) = φ̇(0) = 0. Solve for the first seven drive
cycles (0 ≤ t ≤ 7) and call your solution φ_1(t). Solve again for all the same parameters except that
φ(0) = 0.00001 and call this solution φ_2(t). Let Δφ(t) = φ_2(t) - φ_1(t) and make a plot of log |Δφ(t)|
against t. With this drive strength the motion is chaotic. Does your plot confirm this? In what sense?

12.15 ** [Computer] Numerically solve the equation of motion (12.11) for a DDP with the following
parameters: drive strength $\gamma = 0.3$, drive frequency $\omega = 2\pi$, natural frequency $\omega_0 = 1.5\omega$, damping
constant $\beta = \omega_0/4$, and initial conditions $\phi(0) = \dot\phi(0) = 0$. Solve for the first five drive cycles
($0 < t < 5$) and call your solution $\phi_1(t)$. Solve again for all the same parameters except that $\phi(0) = 1$ (that
is, the initial angle is one radian) and call this solution $\phi_2(t)$. Let $\Delta\phi(t) = \phi_2(t) - \phi_1(t)$ and make a
plot of $\log|\Delta\phi(t)|$ against $t$. Does your plot confirm that $\Delta\phi(t)$ goes to zero exponentially? Note: The
exponential decay continues indefinitely, but $\Delta\phi(t)$ eventually gets so small that it is smaller than the
rounding errors, and the exponential decay cannot be seen. If you want to go further, you will probably 
need to crank up your precision. 

12.16 ** Consider the chaotic motion of a DDP for which the Liapunov exponent is $\lambda = 1$, with time
measured in units of the drive period as usual. (This is very roughly the value found in Problem 12.13.)
(a) Suppose that you need to predict $\phi(t)$ with an accuracy of 1/100 rad and that you know the initial
value $\phi(0)$ within $10^{-6}$ rad. What is the maximum time $t_{\max}$ for which you can predict $\phi(t)$ within the
required accuracy? This $t_{\max}$ is sometimes called the time horizon for prediction within a specified
accuracy. (b) Suppose that, with a vast expenditure of money and labor, you manage to improve the
accuracy of your initial value to $10^{-9}$ radians (a thousand-fold improvement). What is the time horizon
now (for the same required accuracy of prediction)? By what factor has $t_{\max}$ improved? Your results
illustrate the difficulty of making accurate long-term predictions for chaotic motion. 

12.17 ** [Computer] In Figure 12.15, you can see that for $\gamma = 1.503$ the DDP “tries” to execute a
steady rolling motion changing by $2\pi$ once each cycle, but that there is superposed an erratic wobbling
and that the direction of the rolling reverses itself from time to time. For other values of $\gamma$, the pendulum
actually does approach a steady, periodic rolling. (a) Solve the equation of motion (12.11) for a drive
strength $\gamma = 1.3$ and all other parameters as in the first part of Problem 12.14, for $0 < t < 8$. Call your
solution $\phi_1(t)$ and plot it as a function of $t$. Describe the motion. (b) It is hard to be sure that the motion
is periodic based on this graph, because of the steady rolling through $-2\pi$ each cycle. As a better
check, plot $\phi_1(t) + 2\pi t$ against time. Describe what this shows. This kind of periodic rolling motion
is sometimes described as phase-locked. 

12.18 ** [Computer] Since the rolling motion of Problem 12.17 is periodic (and hence not chaotic) 
we would expect the difference $\Delta\phi(t)$ between neighboring solutions (solutions of the same equation,
but with slightly different initial conditions) to decrease exponentially. To illustrate this, do part (a) of
Problem 12.17 and then find the solution of the same problem except that $\phi(0) = 1$. Call this second
solution $\phi_2(t)$ and let $\Delta\phi(t) = \phi_2(t) - \phi_1(t)$. Make a plot of $\log|\Delta\phi(t)|$ against $t$ and comment.

section 12.7 State-Space Orbits 

12.19 * Consider an undamped, undriven simple harmonic oscillator (a mass $m$ on the end of a spring
whose force constant is $k$). (a) Write down the general solution $x(t)$ for the position as a function of
time $t$. Use this to sketch the state-space orbit, showing the motion of the point $[x(t), \dot x(t)]$ in the
two-dimensional state space with coordinates $(x, \dot x)$. Explain the direction in which the orbit is traced as
time advances. (b) Write down the total energy of the system and use conservation of energy to prove
that the state-space orbit is an ellipse. 

12.20 * Consider a weakly-damped, but undriven oscillator as described by Equation (5.28). The
general motion is given by (5.38). (a) Use this to sketch the state-space orbit, showing the movement of
$(x, \dot x)$ for $0 < t < 10$ with the parameters $\delta = 0$, $\omega_1 = 2\pi$, and $\beta = 0.5$. (b) This system has a unique
stationary attractor. What is it? Explain it, both with reference to your sketch and in terms of energy 
conservation. 

section 12.9 The Logistic Map 

12.21 ** Here is an iterated map that is easily studied with the help of your calculator: Let $x_{t+1} = f(x_t)$
where $f(x) = \cos(x)$. If you choose any value for $x_0$, you can find $x_1, x_2, x_3, \cdots$ by simply pressing
the cosine button on your calculator over and over again. (Be sure the calculator is in radians mode.)
(a) Try this for several different choices of $x_0$, finding the first 30 or so values of $x_t$. Describe what
happens. (b) You should have found that there seems to be a single fixed attractor. What is it? Explain
it, by examining (graphically, for instance) the equation for a fixed point $f(x^*) = x^*$ and applying our
test for stability [namely, that a fixed point $x^*$ is stable if $|f'(x^*)| < 1$].

12.22 ** Consider the iterated map $x_{t+1} = f(x_t)$ where $f(x) = x^2$. (a) Show that it has exactly two
fixed points of which just one is stable. What are they? (b) Show that $x_t$ approaches the stable fixed
point if and only if $-1 < x_0 < 1$. The interval $-1 < x_0 < 1$ is called a basin of attraction since all
sequences $x_0, x_1, \cdots$ that start in the “basin” are attracted to the same attractor. (c) Show that $x_t \to \infty$
if and only if $|x_0| > 1$. (Thus we could say the map has a second stable fixed point at $x = \infty$ and the
basin of attraction for this fixed point is the set $|x_0| > 1$.) For chaotic systems, the basins of attraction
can be much more complicated than these examples and are often fractals. 

12.23 ** [Computer] Consider the sine map $x_{t+1} = f(x_t)$ where $f(x) = r\sin(\pi x)$. The interesting
behavior of this map is for $0 < x < 1$ and $0 < r < 1$, so restrict your attention to these ranges. (a) Using
a plot analogous to Figure 12.35, discuss the fixed points of this map. Show that the map has either
one or two fixed points, depending on the value of $r$. Show that when $r$ is small there is just one fixed
point, which is stable. (b) At what value of $r$ (call it $r_0$) does the second fixed point appear? Show that
$r_0$ is also the value of $r$ at which the first fixed point becomes unstable. (c) As $r$ increases, the second
fixed point eventually becomes unstable. Find numerically the value $r_1$ at which this occurs.

12.24 ** [Computer] Consider the sine map of Problem 12.23. Using a programmable calculator (or
a computer) you can easily find the first ten or twenty values of $x_t$ for any chosen initial value $x_0$.
Taking $x_0 = 0.3$, calculate the first 10 values of $x_t$ for each of the following values of the parameter
$r$: (a) $r = 0.1$, (b) $r = 0.5$, (c) $r = 0.78$. In each case plot your results ($x_t$ against $t$) and describe the
long-term attractor. If you did Problem 12.23, are your results here consistent with what you proved 
there? 

12.25 ** [Computer] The sine map of Problem 12.23 exhibits period-doubling cascades just like the 
logistic map. To illustrate this, take $x_0 = 0.8$ and find the first twenty values $x_t$ for each of the following
values of the parameter $r$: (a) $r = 0.60$, (b) $r = 0.79$, (c) $r = 0.85$, and (d) $r = 0.865$. Plot your results
(as four separate plots) and comment. 

12.26 ** The appearance of a two-cycle of the logistic map at the exact moment when the one-cycle 
becomes unstable follows directly from the behavior of the graphs of $f(x)$ and $g(x) = f(f(x))$ as
shown in Figures 12.38 and 12.39. The crucial point is that the function $f(x)$ is a simple arch that
gets steadily higher as we increase the control parameter $r$ from 0; at the same time, $f(f(x))$ starts
out as a simple arch which is lower than $f(x)$, but develops two maxima [higher than $f(x)$] with a
minimum [lower than $f(x)$] in between. Explain clearly in words why this behavior of $f(f(x))$ follows
inevitably from that of $f(x)$. [Hint: Since $f(x)$ is symmetric about $x = 0.5$, you need only consider
its behavior as $x$ runs from 0 to 0.5. The advantage of this argument is that it applies to any map of the
form $f(x) = r\phi(x)$ where $\phi(x)$ has a single symmetric arch (such as the sine map of Problem 12.23).]


12.27 ** The two-cycles of the map $f(x)$ correspond to fixed points of the second-iterate $g(x) =
f(f(x))$. Thus the two values shown as $x_a$ and $x_b$ in Figure 12.37 are roots of the equation $f(f(x)) = x$.
In the case of the logistic map this is a quartic equation, which is not too hard to solve: (a) Verify that
for the logistic function $f(x)$

$$x - f(f(x)) = rx\left(x - \frac{r-1}{r}\right)\left[r^2x^2 - r(r+1)x + r + 1\right]. \tag{12.61}$$

Thus the fixed-point equation $x - f(f(x)) = 0$ has four roots. The first two are $x = 0$ and $x =
(r - 1)/r$. Explain these and show that the other two roots are

$$x_a, x_b = \frac{r + 1 \pm \sqrt{(r+1)(r-3)}}{2r}. \tag{12.62}$$

Explain how you know that these are the two points of a two-cycle. (b) Show that for $r < 3$ these roots
are complex and hence that there is no real two-cycle. (c) For $r > 3$ these two roots are real and there
is a real two-cycle. Find the values of $x_a$ and $x_b$ for the case $r = 3.2$ and verify the values shown in
Figure 12.37.

12.28 ** Equation (12.62) in Problem 12.27 gives the two fixed points of the two-cycle of the logistic 
map. One can observe this two-cycle only if it is stable, which will happen if $|g'(x)| < 1$, where $g(x)$
is the double map $g(x) = f(f(x))$ and $x$ is either of the values $x_a$ or $x_b$. (a) Combine (12.62) with
Equation (12.57) to find $g'(x_a)$. [Notice that because (12.57) is symmetric in $x_a$ and $x_b$, you will get
the same result whether you use $x_a$ or $x_b$; that is, the two points necessarily become stable or unstable
at the same time.] (b) Show that the two-cycle is stable for $3 < r < 1 + \sqrt{6}$. This establishes that the
threshold at which period 2 is replaced by period 4 is $r_2 = 1 + \sqrt{6} = 3.449$.

12.29 ** [Computer] The thresholds $r_n$ for period doubling of the logistic map are given by Equation
(12.58). These should satisfy the Feigenbaum relation (12.17), at least in the limit that $n \to \infty$ (with
$\gamma$ replaced by $r$, of course). Test this claim as follows: (a) If you have not done Problem 12.11,
prove that the Feigenbaum relation (if exactly true) implies that $(r_{n+1} - r_n) = K/\delta^n$. (b) Make a plot
of $\ln(r_{n+1} - r_n)$ against $n$. Find the best-fit straight line to the data and from its slope predict the
Feigenbaum constant. How does your answer compare with the accepted value $\delta = 4.67$?

12.30 ** [Computer] The chaotic evolution of the logistic map shows the same sensitivity to initial
conditions that we met in the DDP. To illustrate this do the following: (a) Using a growth rate $r = 2.6$,
calculate $x_t$ for $1 < t < 40$ starting from $x_0 = 0.4$. Repeat but with the initial condition $x'_0 = 0.5$ (the
prime is just to distinguish this second solution from the first; it does not denote differentiation) and
then plot $\log|x'_t - x_t|$ against $t$. Describe the behavior of the difference $x'_t - x_t$. (b) Repeat part (a)
but with $r = 3.3$. In this case, the long term evolution has period 2. Again describe the behavior of the
difference $x'_t - x_t$. (c) Repeat parts (a) and (b), but with $r = 3.6$. In this case, the evolution is chaotic,
and we expect the difference to grow exponentially; therefore, it is more interesting to take the two initial
values much closer together. To be definite take $x_0 = 0.4$ and $x'_0 = 0.400001$. How does the difference
behave? 

12.31 *** [Computer] When the evolution of the logistic map is non-chaotic, two solutions with the
same $r$ that start out sufficiently close will converge exponentially. (This was illustrated in Problem
12.30.) This does not mean that any two solutions with the same $r$ will converge. (a) Repeat Problem
12.30(a), with all the same parameters except $r = 3.5$ (a value for which we know the long-term motion
has period 4). Does $x'_t - x_t$ approach zero? (b) Now do the same exercise but with the initial conditions
$x_0 = 0.45$ and $x'_0 = 0.5$. Comment. Can you explain why, if the period is greater than 1, it is impossible
that the difference $x'_t - x_t$ go to zero for all choices of initial conditions?

12.32 *** [Computer] Make a bifurcation diagram for the logistic map, in the style of Figure 12.41 
but for the range $0 < r < 3.55$. Take $x_0 = 0.1$. Comment on its main features. [Hint: Start by using a
very small number of points, perhaps just r going from 0 to 3.5 in steps of 0.5 and t going from 51 to 
54. This will let you calculate for each of the values of r individually, and get the feel of how things 
work. To make a good diagram, you will then need to increase the number of points (r going from 0 
to 3.55 in steps of 0.025, and t from 51 to 60, perhaps), and you will certainly need to automate the 
calculation of the large number of points.] 
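
The bookkeeping described in this hint can be sketched in a few lines of Python. This is only one possible way to organize the calculation, not a prescribed method: the function names (`logistic`, `attractor_points`, `bifurcation_data`) are my own hypothetical choices, and the plotting itself is left out, since any scatter-plot tool will do once the list of points exists.

```python
def logistic(r, x):
    """One step of the logistic map x -> r x (1 - x)."""
    return r * x * (1.0 - x)


def attractor_points(r, x0=0.1, t_first=51, t_last=54):
    """Iterate from x0 and return the values x_t for t_first <= t <= t_last."""
    x = x0
    points = []
    for t in range(1, t_last + 1):
        x = logistic(r, x)
        if t >= t_first:  # discard the early transient, keep the late values
            points.append(x)
    return points


def bifurcation_data(r_start, r_stop, r_step, **kwargs):
    """Return a list of (r, x_t) pairs, ready to be shown as a scatter plot."""
    n_steps = int(round((r_stop - r_start) / r_step))
    data = []
    for i in range(n_steps + 1):
        r = r_start + i * r_step  # avoids accumulating rounding error in r
        for x in attractor_points(r, **kwargs):
            data.append((r, x))
    return data


# The coarse first pass suggested in the hint: r = 0, 0.5, ..., 3.5 and t = 51, ..., 54.
coarse = bifurcation_data(0.0, 3.5, 0.5)
```

For the finer diagram one would call, for instance, `bifurcation_data(0.0, 3.55, 0.025, t_first=51, t_last=60)` and plot the second member of each pair against the first.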

12.33 *** [Computer] Reproduce the logistic bifurcation diagram of Figure 12.41 for the range
$2.8 < r < 4$. Take $x_0 = 0.1$. [Hint: To make Figure 12.41 I used about 50,000 points, but you certainly
don’t need to use that many. In any case, start by using a very small number of points, perhaps
just r going from 2.8 to 3.4 in steps of 0.2 and t going from 51 to 54. This will let you calculate for 
each of the values of r individually, and get the feel of how things work. To make a good diagram, you 
will then need to increase the number of points (r going from 2.8 to 4 in steps of 0.025, and t from 
500 to 600, perhaps), and you will certainly need to automate the calculation of the large number of 
points.] 

12.34*** [Computer] Make a bifurcation diagram for the sine map of Problems 12.23 and 12.25. 
This should resemble Figure 12.41 but for the range $0.6 < r < 1$. Take $x_0 = 0.1$. Comment on its main
features. [Hint: Start by using a very small number of points, perhaps just r going from 0.6 to 0.8 in steps 
of 0.05 and t going from 51 to 54. This will let you calculate for each of the values of r individually, and 
get the feel of how things work. To make a good diagram, you will then need to increase the number of 
points (r going from 0.6 to 1 in steps of 0.005, and t from 400 to 500, perhaps), and you will certainly 
need to automate the calculation of the large number of points.] 



CHAPTER 13


Hamiltonian Mechanics 


In the first six chapters of this book, we worked entirely with the Newtonian form 
of mechanics, which describes the world in terms of forces and accelerations (as 
related by the second law) and is primarily suited for use in Cartesian coordinate 
systems. In Chapter 7, we met the Lagrangian formulation. This second formulation 
is entirely equivalent to Newton’s, in the sense that either one can be derived from the 
other, but the Lagrangian form is considerably more flexible with regard to choice 
of coordinates. The $n$ Cartesian coordinates that describe a system in Newtonian
terms are replaced by a set of $n$ generalized coordinates $q_1, q_2, \cdots, q_n$, and Lagrange’s
equations are equally valid for essentially any choice of $q_1, q_2, \cdots, q_n$. As we have
seen on many occasions, this versatility allows one to solve many problems much more 
easily using Lagrange’s formulation. The Lagrangian approach also has the advantage 
of eliminating the forces of constraint. On the other hand, the Lagrangian method 
is at a disadvantage when applied to dissipative systems (for example, systems with 
friction). By now, I hope you feel comfortable with both formulations and are familiar 
with the advantages and disadvantages of each. 

Newtonian mechanics was first expounded by Newton in his Principia Mathematica,
published in 1687. Lagrange published his formulation in his book Mécanique
Analytique in 1788. In the early nineteenth century, various physicists, including Lagrange,
developed yet a third formulation of mechanics, which was put into a complete
form in 1834 by the Irish mathematician William Hamilton (1805-1865) and has come 
to be called Hamiltonian mechanics. It is this third formulation of mechanics that is 
the subject of this chapter. 

Like the Lagrangian version, Hamiltonian mechanics is equivalent to Newtonian 
but is considerably more flexible in its choice of coordinates. In fact, in this respect it 
is even more flexible than the Lagrangian approach. Where the Lagrangian formalism
centers on the Lagrangian function $\mathcal{L}$, the Hamiltonian approach is based on the
Hamiltonian function $\mathcal{H}$ (which we met briefly in Chapter 7). For most of the systems
we shall meet, $\mathcal{H}$ is just the total energy. Thus one advantage of Hamilton’s formalism
is that it is based on a function, $\mathcal{H}$, which (unlike the Lagrangian $\mathcal{L}$) has a clear
physical significance and is frequently conserved. The Hamiltonian approach is also
especially well suited for handling other conserved quantities and implementing
various approximation schemes. It has been generalized to various different branches
of physics; in particular, Hamiltonian mechanics leads very naturally from classical
mechanics into quantum mechanics. For all these reasons, Hamilton’s formulation
plays an important role in many branches of modern physics, including astrophysics,
plasma physics, and the design of particle accelerators. Unfortunately, at the level of 
this book, it is hard to demonstrate many of the advantages of Hamilton’s version over 
Lagrange’s, and in this chapter I shall have to ask you to be content with learning the 
former as just an alternative to the latter — an alternative several of whose advantages 
I can mention but not explore in depth. If you go on to take a more advanced course 
in classical mechanics or to study quantum mechanics, you will certainly meet many 
of this chapter’s ideas again. 


13.1 The Basic Variables 


Because the Hamiltonian version of mechanics is closer to the Lagrangian than to the 
Newtonian and arises naturally from the Lagrangian, let us start by reviewing the main 
features of the latter, which centers on the Lagrangian function $\mathcal{L}$. For most systems
of interest, $\mathcal{L}$ is just the difference of the kinetic and potential energies, $\mathcal{L} = T - U$,
and in this chapter we shall confine attention to systems for which this is the case. The
Lagrangian $\mathcal{L}$ is a function of the $n$ generalized coordinates $q_1, \cdots, q_n$, their $n$ time
derivatives (or generalized velocities) $\dot q_1, \cdots, \dot q_n$, and, perhaps, the time:

$$\mathcal{L} = \mathcal{L}(q_1, \cdots, q_n, \dot q_1, \cdots, \dot q_n, t) = T - U. \tag{13.1}$$


The $n$ coordinates $(q_1, \cdots, q_n)$ specify a position or “configuration” of the system,
and can be thought of as defining a point in an $n$-dimensional configuration space.
The $2n$ coordinates $(q_1, \cdots, q_n, \dot q_1, \cdots, \dot q_n)$ define a point in state space, and specify
a set of initial conditions (at any chosen time $t_0$) that determine a unique solution of
the $n$ second-order differential equations of motion, Lagrange’s equations:

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q_i} \qquad [i = 1, \cdots, n]. \tag{13.2}$$


For each set of initial conditions, these equations of motion determine a unique path 
or “orbit” through state space. 

You may also recall that we defined a generalized momentum given by 


$$p_i = \frac{\partial\mathcal{L}}{\partial\dot q_i}. \tag{13.3}$$

If the coordinates $(q_1, \cdots, q_n)$ are in fact Cartesian coordinates, the generalized
momenta $p_i$ are the corresponding components of the usual momenta; in general, $p_i$
is not actually a momentum, but does, as we have seen, play an analogous role. The
generalized momentum $p_i$ is also called the canonical momentum or the momentum
conjugate to $q_i$.

In the Hamiltonian approach, the central role of the Lagrangian $\mathcal{L}$ is taken over by
the Hamiltonian function, or just Hamiltonian, $\mathcal{H}$ defined as

$$\mathcal{H} = \sum_{i=1}^{n} p_i \dot q_i - \mathcal{L}. \tag{13.4}$$

The equations of motion, which we shall derive in the next two sections, involve
derivatives of $\mathcal{H}$ rather than $\mathcal{L}$ as in Lagrange’s equations. We met the Hamiltonian
function briefly in Section 7.8, where we proved that, provided the generalized
coordinates $(q_1, \cdots, q_n)$ are “natural” (that is, the relation between the $q$'s and the
underlying Cartesian coordinates is time independent), $\mathcal{H}$ is just the total energy of
the system and is, therefore, familiar and easy to visualize.

There is a second important difference between the Lagrangian and Hamiltonian 
formalisms. In the former we label the state of the system by the $2n$ coordinates

$$(q_1, \cdots, q_n, \dot q_1, \cdots, \dot q_n), \qquad \text{[Lagrange]} \tag{13.5}$$

whereas in the latter we shall use the coordinates

$$(q_1, \cdots, q_n, p_1, \cdots, p_n), \qquad \text{[Hamilton]} \tag{13.6}$$

consisting of the n generalized positions and the n generalized momenta (instead of 
the generalized velocities). This choice of coordinates has several advantages, a few 
of which I shall sketch and some of which you will have to take on faith. 

Just as we can regard the $2n$ coordinates (13.5) of the Lagrange approach as
defining a point in a $2n$-dimensional state space, so we can regard the $2n$ coordinates
(13.6) of the Hamiltonian approach as defining a point in a $2n$-dimensional space,
which is usually called phase space.¹ Just as the Lagrange equations of motion (13.2)
determine a unique path in state space starting from any initial point (13.5), so (we shall 
see) Hamilton’s equations determine a unique path in phase space starting from any 
initial point (13.6). A succinct way to state some of the advantages of the Hamiltonian 
formalism is that phase space has certain geometrical properties that make it more 
convenient than state space. 

Like Lagrange’s approach, Hamilton’s is best suited to systems that are subject to 
no frictional forces. Accordingly, I shall assume throughout this chapter that all the 
forces of interest are conservative or can at least be derived from a potential energy 
function. Although this restriction excludes many interesting mechanical systems, it 
still includes a huge number of important problems, especially in astrophysics and at 
the microscopic — atomic and molecular — level. 


¹ Many authors use the names “state space” and “phase space” interchangeably, but it is convenient
to have different names for the different spaces, and I shall reserve “state space” for the space
of positions and generalized velocities, and “phase space” for that of positions and generalized
momenta.

13.2 Hamilton’s Equations for One-Dimensional Systems


To minimize notational complications, I shall first derive Hamilton’s equations of 
motion for a conservative, one-dimensional system with a single “natural” generalized 
coordinate $q$. For example, you could think of a simple, plane pendulum, in which
case $q$ could be the usual angle $\phi$, or a bead on a stationary wire, in which case $q$ could
be the horizontal distance $x$ along the wire. For any such system, the Lagrangian is a
function of $q$ and $\dot q$, that is,

$$\mathcal{L} = \mathcal{L}(q, \dot q) = T(q, \dot q) - U(q). \tag{13.7}$$

Recall that, in general, the kinetic energy can depend on $q$ as well as $\dot q$, whereas, for
conservative systems, the potential energy depends only on $q$. For example, for the
simple pendulum (mass $m$ and length $L$),

$$\mathcal{L} = \mathcal{L}(\phi, \dot\phi) = \tfrac{1}{2} m L^2 \dot\phi^2 - mgL(1 - \cos\phi), \tag{13.8}$$

where, in this case, the kinetic energy involves only $\dot\phi$, not $\phi$. For a bead sliding on
a frictionless wire of variable height $y = f(x)$, we saw in Example 11.1 (page 438)
that

$$\mathcal{L} = \mathcal{L}(x, \dot x) = T - U = \tfrac{1}{2} m \left[1 + f'(x)^2\right] \dot x^2 - mg f(x). \tag{13.9}$$

Here the $x$ dependence of the kinetic energy came about when we rewrote the $v$
in $\tfrac{1}{2}mv^2$ in terms of the horizontal distance $x$. The two examples (13.8) and (13.9)
illustrate a general result that we proved in Section 7.8, namely that the Lagrangian for a
conservative system with “natural” coordinates (and in one dimension here) has the
general form

$$\mathcal{L} = \mathcal{L}(q, \dot q) = T - U = \tfrac{1}{2} A(q) \dot q^2 - U(q). \tag{13.10}$$

Notice that, while the kinetic energy can depend on $q$ in a complicated way, through
the function $A(q)$, its dependence on $\dot q$ is just through the simple quadratic factor $\dot q^2$.
As you can easily check by writing it down, Lagrange’s equation for this Lagrangian 
is automatically a second-order differential equation for q. 

The Hamiltonian is defined by (13.4), which in one dimension reduces to

$$\mathcal{H} = p\dot q - \mathcal{L}. \tag{13.11}$$

In the discussion of Section 7.8, I offered some reason why one might perhaps expect
a function defined in this way to be an interesting function to study. For now, let us
just accept the definition as an inspired suggestion by Hamilton, a suggestion whose
merit will appear as we proceed.² Given the form (13.10) of $\mathcal{L}$, we can calculate the


² Actually, the change from $\mathcal{L}$ to $\mathcal{H}$ as the object of primary interest is an example of a
mathematical maneuver, called a Legendre transformation, which plays an important role in several
fields, most notably thermodynamics. For example, the change from the thermodynamic internal

generalized momentum $p$ as

$$p = \frac{\partial\mathcal{L}}{\partial\dot q} = A(q)\dot q \tag{13.12}$$

so that $p\dot q = A(q)\dot q^2 = 2T$. Substituting into (13.11), we find that

$$\mathcal{H} = p\dot q - \mathcal{L} = 2T - (T - U) = T + U.$$


That is, the Hamiltonian $\mathcal{H}$ for the “natural” system considered here is precisely the
total energy, the same result we proved for any “natural” system in any number of
dimensions in Section 7.8.
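
As a quick check of this result, here is the same calculation written out for the simple pendulum of (13.8). It is offered only as an illustrative sketch; the identification $A(\phi) = mL^2$ is read off by comparing (13.8) with (13.10), and nothing beyond those two equations is assumed.

```latex
% Pendulum of (13.8): T = (1/2) m L^2 \dot\phi^2,  U = m g L (1 - \cos\phi),  so A(\phi) = m L^2.
\begin{align*}
  p &= \frac{\partial\mathcal{L}}{\partial\dot\phi} = mL^2\dot\phi, \\
  p\dot\phi &= mL^2\dot\phi^2 = 2T, \\
  \mathcal{H} &= p\dot\phi - \mathcal{L}
     = mL^2\dot\phi^2 - \left[\tfrac{1}{2}mL^2\dot\phi^2 - mgL(1-\cos\phi)\right] \\
    &= \tfrac{1}{2}mL^2\dot\phi^2 + mgL(1-\cos\phi) = T + U .
\end{align*}
```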

The next step in setting up the Hamiltonian formalism is perhaps the most subtle. 
In the Lagrangian approach, we think of $\mathcal{L}$ as a function of $q$ and $\dot q$, as is indicated
explicitly in (13.10). Similarly, (13.12) gives the generalized momentum $p$ in terms
of $q$ and $\dot q$. However, we can solve (13.12) for $\dot q$ in terms of $q$ and $p$:

$$\dot q = p/A(q) = \dot q(q, p), \tag{13.13}$$

say. With $\dot q$ expressed as a function of $q$ and $p$, let us now look at the Hamiltonian.
Wherever $\dot q$ appears in $\mathcal{H}$, we can replace it by $\dot q(q, p)$, and $\mathcal{H}$ becomes a function
of $q$ and $p$. In all its horrible detail, (13.11) becomes

$$\mathcal{H}(q, p) = p\,\dot q(q, p) - \mathcal{L}(q, \dot q(q, p)). \tag{13.14}$$


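Our final step is to get Hamilton’s equations of motion. To find these, we just
evaluate the derivatives of $\mathcal{H}(q, p)$ with respect to $q$ and $p$. First, using the chain
rule, we differentiate (13.14) with respect to $q$:

$$\frac{\partial\mathcal{H}}{\partial q} = p\frac{\partial\dot q}{\partial q} - \left[\frac{\partial\mathcal{L}}{\partial q} + \frac{\partial\mathcal{L}}{\partial\dot q}\frac{\partial\dot q}{\partial q}\right].$$

Now, in the third term on the right, you will recognize that $\partial\mathcal{L}/\partial\dot q = p$. Thus, the first
and third terms on the right cancel one another, leaving just

$$\frac{\partial\mathcal{H}}{\partial q} = -\frac{\partial\mathcal{L}}{\partial q} = -\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot q} = -\frac{d}{dt}p = -\dot p \tag{13.15}$$

where the second equality follows from the Lagrange equation (13.2). This equation
gives the time derivative of $p$ (that is, $\dot p$) in terms of the Hamiltonian $\mathcal{H}$ and is the
first of the two Hamiltonian equations of motion. Before we discuss it, let’s derive the
second one.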

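Differentiating (13.14) with respect to $p$ and using the chain rule, we find

$$\frac{\partial\mathcal{H}}{\partial p} = \left[\dot q + p\frac{\partial\dot q}{\partial p}\right] - \frac{\partial\mathcal{L}}{\partial\dot q}\frac{\partial\dot q}{\partial p} = \dot q \tag{13.16}$$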


energy $U$ to enthalpy $H$ is a Legendre transformation, closely analogous to Hamilton’s change
from $\mathcal{L}$ to $\mathcal{H}$.

since the second and third terms in the middle expression cancel exactly. This is the
second of the Hamiltonian equations of motion and gives $\dot q$ in terms of the Hamiltonian
$\mathcal{H}$. Collecting them together (and reordering a bit), we have Hamilton’s equations
for a one-dimensional system:

$$\dot q = \frac{\partial\mathcal{H}}{\partial p} \qquad\text{and}\qquad \dot p = -\frac{\partial\mathcal{H}}{\partial q}. \tag{13.17}$$


In the Lagrangian formalism, the equation of motion of a one-dimensional system is 
a single second-order differential equation for q. In the Hamiltonian approach, there 
are two first-order equations, one for q and one for p. Before we extend this result to 
more general systems or discuss any advantages the new formalism may have, let us 
look at a couple of simple examples. 
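
One practical consequence of having two first-order equations is that they can be handed directly to a standard numerical integrator. The following Python sketch does this for the simple pendulum, using the Hamiltonian $\mathcal{H} = p^2/(2mL^2) + mgL(1 - \cos\phi)$ that follows from (13.8). The numerical values of `M`, `L`, and `G`, the fourth-order Runge–Kutta stepper, and the step size are all illustrative assumptions of mine, not anything fixed by the text.

```python
import math

# Hypothetical parameter values, chosen only for illustration (SI units).
M, L, G = 1.0, 1.0, 9.8


def hamiltonian(phi, p):
    """Total energy of the pendulum as a function of angle and conjugate momentum."""
    return p**2 / (2 * M * L**2) + M * G * L * (1 - math.cos(phi))


def rhs(phi, p):
    """Hamilton's equations (13.17): (dphi/dt, dp/dt) = (dH/dp, -dH/dphi)."""
    return p / (M * L**2), -M * G * L * math.sin(phi)


def rk4_step(phi, p, dt):
    """Advance (phi, p) by one fourth-order Runge-Kutta step of size dt."""
    k1 = rhs(phi, p)
    k2 = rhs(phi + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
    k3 = rhs(phi + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
    k4 = rhs(phi + dt * k3[0], p + dt * k3[1])
    phi_new = phi + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    p_new = p + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return phi_new, p_new


def integrate(phi0, p0, dt=1e-3, n_steps=10000):
    """Follow the phase-space orbit starting from (phi0, p0)."""
    phi, p = phi0, p0
    for _ in range(n_steps):
        phi, p = rk4_step(phi, p, dt)
    return phi, p


phi0, p0 = 0.5, 0.0  # released from rest at 0.5 rad
phi_end, p_end = integrate(phi0, p0)
energy_drift = abs(hamiltonian(phi_end, p_end) - hamiltonian(phi0, p0))
```

Because $\mathcal{H}$ is the conserved total energy here, `energy_drift` gives a simple check on the integration: it should be small compared with the energy itself.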


example 13.1 A Bead on a Straight Wire 

Consider a bead sliding on a frictionless rigid straight wire lying along the 
x axis, as shown in Figure 13.1. The bead has mass m and is subject to a 
conservative force, with corresponding potential energy $U(x)$. Write down
the Lagrangian and Lagrange’s equation of motion. Find the Hamiltonian and 
Hamilton’s equations, and compare the two approaches. 

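Naturally, we take as our generalized coordinate $q$ the Cartesian $x$. The
Lagrangian is then

$$\mathcal{L}(x, \dot x) = T - U = \tfrac{1}{2} m \dot x^2 - U(x).$$

The corresponding Lagrange equation is

$$\frac{\partial\mathcal{L}}{\partial x} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot x} \qquad\text{or}\qquad -\frac{dU}{dx} = m\ddot x \tag{13.18}$$

which is just Newton’s $F = ma$, as we would expect.

Figure 13.1 A bead of mass $m$ sliding on a frictionless straight wire.

To set up the Hamiltonian formalism, we must first find the generalized
momentum,

$$p = \frac{\partial\mathcal{L}}{\partial\dot x} = m\dot x.$$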

As expected, this is just the conventional $mv$ momentum. This equation can be
solved to give $\dot x = p/m$, which can then be substituted into the Hamiltonian to
give

$$\mathcal{H} = p\dot x - \mathcal{L} = \frac{p^2}{m} - \left[\frac{p^2}{2m} - U(x)\right] = \frac{p^2}{2m} + U(x)$$

which you will recognize as the total energy, with the kinetic term $\tfrac{1}{2}m\dot x^2$ rewritten
in terms of momentum as $p^2/(2m)$. Finally, the two Hamilton equations
(13.17) are

$$\dot x = \frac{\partial\mathcal{H}}{\partial p} = \frac{p}{m} \qquad\text{and}\qquad \dot p = -\frac{\partial\mathcal{H}}{\partial x} = -\frac{dU}{dx}.$$


The first of these is, from a Newtonian point of view, just the traditional definition 
of momentum, and, when we substitute this definition into the second equation, 
it gives us back $m\ddot x = -dU/dx$ again. As had to be the case, Newton, Lagrange,
and Hamilton all lead us to the same familiar equation. In this very simple 
example, neither Lagrange nor Hamilton has any visible advantage over Newton. 
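
For readers who like to check such results numerically, the short Python sketch below evaluates Hamilton's equations for the bead by differentiating $\mathcal{H}(x, p)$ with central differences and comparing with $p/m$ and $-dU/dx$. The potential $U(x) = \tfrac{1}{2}kx^2$, the values of `M` and `K`, and the helper name `hamilton_rhs` are hypothetical choices made purely for illustration; the example itself leaves $U(x)$ unspecified.

```python
M = 2.0  # hypothetical bead mass
K = 3.0  # hypothetical spring constant for the illustrative potential


def potential(x):
    """Illustrative potential energy U(x) = (1/2) k x^2 (an assumption, not from the text)."""
    return 0.5 * K * x**2


def hamiltonian(x, p):
    """H(x, p) = p^2 / (2m) + U(x)."""
    return p**2 / (2 * M) + potential(x)


def hamilton_rhs(H, x, p, h=1e-6):
    """Central-difference estimate of (dx/dt, dp/dt) = (dH/dp, -dH/dx)."""
    x_dot = (H(x, p + h) - H(x, p - h)) / (2 * h)
    p_dot = -(H(x + h, p) - H(x - h, p)) / (2 * h)
    return x_dot, p_dot


x, p = 0.7, 1.3  # an arbitrary point in phase space
x_dot, p_dot = hamilton_rhs(hamiltonian, x, p)
```

With this potential, `x_dot` should agree with `p / M` and `p_dot` with `-K * x`, which is just the pair of Hamilton equations found above.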


example 13.2 Atwood’s Machine 

Set up the Hamiltonian formalism for the Atwood machine, first shown as Figure
4.15 and shown again here as Figure 13.2. Use the height $x$ of $m_1$ measured
downward as the one generalized coordinate.

The Lagrangian is $\mathcal{L} = T - U$, where, as we saw in Example 7.3 (page 255),

$$T = \tfrac{1}{2}(m_1 + m_2)\dot x^2 \qquad\text{and}\qquad U = -(m_1 - m_2)gx. \tag{13.19}$$



Figure 13.2 An Atwood machine consisting of two masses, $m_1$ and
$m_2$, suspended by a massless inextensible string that passes over a
massless, frictionless pulley. Because the string’s length is fixed, the
position of the whole system is specified by the distance $x$ of $m_1$ below
any convenient fixed level.

We can calculate the Hamiltonian $\mathcal{H}$ either as $\mathcal{H} = p\dot x - \mathcal{L}$ or, what is usually
a little quicker (provided it is true³), as $\mathcal{H} = T + U$. Either way, we must first
find the generalized momentum $p = \partial\mathcal{L}/\partial\dot x$ or, since $U$ does not involve $\dot x$,

$$p = \frac{\partial T}{\partial\dot x} = (m_1 + m_2)\dot x.$$

We solve this to give $\dot x$ in terms of $p$ as $\dot x = p/(m_1 + m_2)$, which we substitute
into $\mathcal{H}$ to give $\mathcal{H}$ as a function of $x$ and $p$:

$$\mathcal{H} = T + U = \frac{p^2}{2(m_1 + m_2)} - (m_1 - m_2)gx. \tag{13.20}$$

We can now write down the two Hamilton equations of motion (13.17) as

$$\dot x = \frac{\partial\mathcal{H}}{\partial p} = \frac{p}{m_1 + m_2} \qquad\text{and}\qquad \dot p = -\frac{\partial\mathcal{H}}{\partial x} = (m_1 - m_2)g.$$

Again, the first of these is just a restatement of the definition of the generalized
momentum and, when we combine this with the second, we get the well-known
result for the acceleration of the Atwood machine,

$$\ddot x = \frac{m_1 - m_2}{m_1 + m_2}\,g.$$
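
As a rough numerical cross-check (a sketch under stated assumptions, not part of the example), one can step the two Hamilton equations forward in time and compare the result with the constant-acceleration formula $x = \tfrac{1}{2}\ddot x\,t^2$. The masses, the value of $g$, the final time, and the simple midpoint stepping scheme below are all hypothetical choices of mine.

```python
G = 9.8            # gravitational acceleration (SI units), assumed value
M1, M2 = 3.0, 1.0  # hypothetical masses


def hamilton_rhs(x, p):
    """Hamilton's equations for the Atwood machine, from (13.20)."""
    x_dot = p / (M1 + M2)  # dH/dp
    p_dot = (M1 - M2) * G  # -dH/dx (independent of x and p)
    return x_dot, p_dot


def integrate(t_end=2.0, n_steps=2000):
    """Step (x, p) from rest at x = 0, evaluating x_dot at the midpoint value of p."""
    dt = t_end / n_steps
    x, p = 0.0, 0.0
    for _ in range(n_steps):
        _, p_dot = hamilton_rhs(x, p)
        p_mid = p + 0.5 * dt * p_dot
        x_dot, _ = hamilton_rhs(x, p_mid)
        x += dt * x_dot
        p += dt * p_dot
    return x, p


x_num, p_num = integrate()

# Closed-form comparison using the acceleration derived above.
accel = (M1 - M2) / (M1 + M2) * G
x_exact = 0.5 * accel * 2.0**2
```

Because $\dot p$ is constant, $p$ grows linearly and $x$ quadratically in time, so this midpoint scheme reproduces `x_exact` to within rounding error; for a less trivial Hamiltonian the agreement would depend on the step size.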


These two examples illustrate several of the general features of the Hamiltonian
approach: Our first task is always to write down the Hamiltonian $\mathcal{H}$ (just as in the
Lagrangian approach the first task is to write down $\mathcal{L}$). In the Hamiltonian approach
there are usually a couple of extra steps, which are to write down the generalized
momentum, to solve the resulting equation for the generalized velocity, and to express
$\mathcal{H}$ as a function of position and momentum. Once this is done, one can just turn
the handle and crank out Hamilton’s equations. In general, there is no guarantee 
that the resulting equations will be easy to solve, but it is a wonderful property of 
Hamilton’s approach (like Lagrange’s) that it provides an almost infallible way to 
find the equations of motion. 
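As a numerical check on the Atwood-machine example, the following minimal sketch steps Hamilton's two equations forward in time and compares the result with the closed-form acceleration $(m_1 - m_2)g/(m_1 + m_2)$. The masses, the value of $g$, the step size, and the function names `atwood_rhs` and `integrate` are illustrative assumptions, not anything taken from the text.

```python
# Hamilton's equations for the Atwood machine, Eq. (13.20).
# The masses and g below are illustrative assumptions, not values from the text.


def atwood_rhs(x, p, m1, m2, g):
    """Return (x_dot, p_dot) for H = p^2 / (2 (m1 + m2)) - (m1 - m2) g x."""
    x_dot = p / (m1 + m2)     # x_dot =  dH/dp
    p_dot = (m1 - m2) * g     # p_dot = -dH/dx
    return x_dot, p_dot


def integrate(x, p, m1, m2, g, dt, steps):
    """Advance (x, p) with the midpoint rule.

    The midpoint rule is exact here (up to rounding), because p_dot is
    constant and x_dot is linear in p.
    """
    for _ in range(steps):
        _, p_dot = atwood_rhs(x, p, m1, m2, g)
        x_dot_mid, _ = atwood_rhs(x, p + 0.5 * dt * p_dot, m1, m2, g)
        x += dt * x_dot_mid
        p += dt * p_dot
    return x, p


m1, m2, g = 3.0, 1.0, 9.8
t_end = 2.0
x_final, p_final = integrate(0.0, 0.0, m1, m2, g, dt=0.01, steps=200)

# Closed-form comparison: uniform acceleration (m1 - m2) g / (m1 + m2) from rest.
accel = (m1 - m2) * g / (m1 + m2)
x_expected = 0.5 * accel * t_end**2
p_expected = (m1 + m2) * accel * t_end
print(x_final, x_expected)
print(p_final, p_expected)
```

The two printed pairs should agree to rounding error, which illustrates that the pair of first-order equations carries the same information as the single second-order equation for $\ddot{x}$.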


13.3 Hamilton’s Equations in Several Dimensions 


Our derivation of Hamilton’s equations for a one-dimensional system is easily extended
to multidimensional systems. The only real problem is that the equations can
become badly cluttered with indices, so, to minimize the clutter, I shall use the
abbreviation introduced in Section 11.5: The configuration of an $n$-dimensional system
is given by $n$ generalized coordinates $q_1, \cdots, q_n$, which I shall represent by a single
bold-face $\mathbf{q}$:

$$\mathbf{q} = (q_1, \cdots, q_n).$$

Similarly, the generalized velocities become $\dot{\mathbf{q}} = (\dot{q}_1, \cdots, \dot{q}_n)$, and the generalized
momenta are

$$\mathbf{p} = (p_1, \cdots, p_n).$$

It is important to remember that for now a bold-face $\mathbf{q}$ or $\mathbf{p}$ is not necessarily a
three-dimensional vector. Rather $\mathbf{q}$ and $\mathbf{p}$ are $n$-dimensional vectors in the space of
generalized positions or generalized momenta.

3 Remember that this second expression is true provided the generalized coordinate is “natural,”
that is, the relation between the generalized coordinate and the underlying Cartesian coordinates is
independent of time, a condition which is certainly met here (Problem 13.4).

Hamilton’s equations follow directly from Lagrange’s equations in their standard 
form. Thus to prove the former, we have only to assume the truth of the latter. To 
be specific, however, I shall make the same assumptions that we used in Chapter 7: 
I shall assume that any constraints are holonomic; that is, the number of degrees of 
freedom is equal to the number of generalized coordinates. I shall also assume that the 
nonconstraint forces can be derived from a potential energy function, though it is not 
essential that they be conservative (that is, the potential energy is allowed to depend on 
t). The equations that relate the $N$ underlying Cartesian coordinates $\mathbf{r}_1, \cdots, \mathbf{r}_N$ to the
$n$ generalized coordinates $q_1, \cdots, q_n$ can depend on time; that is, it is not essential that
the generalized coordinates be “natural.” These assumptions are enough to guarantee 
that the standard Lagrangian formalism applies, and will let us derive from it the 
Hamiltonian one. Thus our starting point is that there is a Lagrangian 

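$$\mathcal{L} = \mathcal{L}(\mathbf{q}, \dot{\mathbf{q}}, t) = T - U$$

and that the evolution of our system is governed by the $n$ Lagrange equations,

$$\frac{\partial\mathcal{L}}{\partial q_i} = \frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{q}_i} \qquad [i = 1, \cdots, n]. \tag{13.21}$$

We shall define the Hamiltonian function as in (13.4),

$$\mathcal{H} = \sum_{i=1}^{n} p_i\dot{q}_i - \mathcal{L} \tag{13.22}$$

where the generalized momenta are defined by

$$p_i = \frac{\partial\mathcal{L}(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial\dot{q}_i} \qquad [i = 1, \cdots, n] \tag{13.23}$$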


as in (13.3). Just as in the one-dimensional case, our next step is to express the
Hamiltonian as a function of the $2n$ variables $\mathbf{q}$ and $\mathbf{p}$. To this end, note that we can
view the equations (13.23) as $n$ simultaneous equations for the $n$ generalized velocities
$\dot{\mathbf{q}}$. We can in principle solve these equations to give the generalized velocities in terms
of the variables $\mathbf{p}$, $\mathbf{q}$, and $t$:

$$\dot{q}_i = \dot{q}_i(q_1, \cdots, q_n, p_1, \cdots, p_n, t) \qquad [i = 1, \cdots, n]$$

or, more succinctly,

$$\dot{\mathbf{q}} = \dot{\mathbf{q}}(\mathbf{q}, \mathbf{p}, t).$$

We can now eliminate the generalized velocities from our definition of the Hamiltonian
to give (again in agonizing detail)

$$\mathcal{H} = \mathcal{H}(\mathbf{q}, \mathbf{p}, t) = \sum_{i=1}^{n} p_i\,\dot{q}_i(\mathbf{q}, \mathbf{p}, t) - \mathcal{L}\bigl(\mathbf{q}, \dot{\mathbf{q}}(\mathbf{q}, \mathbf{p}, t), t\bigr). \tag{13.24}$$

The derivation of Hamilton’s equations now proceeds very much as in one dimension,
and I shall leave you to fill in the details (Problem 13.15). Following the same
steps as led from (13.14) to (13.17), we differentiate $\mathcal{H}$ with respect to $q_i$ and then $p_i$,
and this leads to Hamilton’s equations:

$$\dot{q}_i = \frac{\partial\mathcal{H}}{\partial p_i} \quad\text{and}\quad \dot{p}_i = -\frac{\partial\mathcal{H}}{\partial q_i} \qquad [i = 1, \cdots, n]. \tag{13.25}$$

Notice that for a system with $n$ degrees of freedom, the Hamiltonian approach gives
us $2n$ first-order differential equations, instead of the $n$ second-order equations of
Lagrange.

Before we discuss an example of Hamilton’s equations, there is one more derivative
of $\mathcal{H}$ to consider, its derivative with respect to time. This is actually quite subtle. The
function $\mathcal{H}(\mathbf{q}, \mathbf{p}, t)$ could vary with time for two reasons: First, as the motion proceeds,
the $2n$ coordinates $(\mathbf{q}, \mathbf{p})$ vary, and this could cause $\mathcal{H}(\mathbf{q}, \mathbf{p}, t)$ to change; in addition,
$\mathcal{H}(\mathbf{q}, \mathbf{p}, t)$ may have an explicit time dependence, as indicated by the final argument
$t$, and this also can make $\mathcal{H}$ vary with time. Mathematically, this means that $d\mathcal{H}/dt$
contains $2n + 1$ terms, as follows:

$$\frac{d\mathcal{H}}{dt} = \sum_{i=1}^{n}\left[\frac{\partial\mathcal{H}}{\partial q_i}\dot{q}_i + \frac{\partial\mathcal{H}}{\partial p_i}\dot{p}_i\right] + \frac{\partial\mathcal{H}}{\partial t}. \tag{13.26}$$

It is important to understand the difference between the two derivatives of $\mathcal{H}$ in this
equation. The derivative on the left, $d\mathcal{H}/dt$ (sometimes called the total derivative),
is the actual rate of change of $\mathcal{H}$ as the motion proceeds, with all the coordinates
$q_1, \cdots, q_n, p_1, \cdots, p_n$ changing as $t$ advances. That on the right, $\partial\mathcal{H}/\partial t$, is the partial
derivative, which is the rate of change of $\mathcal{H}$ if we vary its last argument $t$ holding all the
other arguments fixed. In particular, if $\mathcal{H}$ does not depend explicitly on $t$, this partial
derivative will be zero. Now it is easy to see that, because of Hamilton’s equations
(13.25), each pair of terms in the sum of (13.26) is exactly zero [substituting (13.25),
the $i$th pair becomes $(\partial\mathcal{H}/\partial q_i)(\partial\mathcal{H}/\partial p_i) - (\partial\mathcal{H}/\partial p_i)(\partial\mathcal{H}/\partial q_i) = 0$], so that we have the
simple result

$$\frac{d\mathcal{H}}{dt} = \frac{\partial\mathcal{H}}{\partial t}.$$

That is, $\mathcal{H}$ varies with time only to the extent that it is explicitly time dependent.
In particular, if $\mathcal{H}$ does not depend explicitly on $t$ (as is often the case), then $\mathcal{H}$ is
a constant in time; that is, the quantity $\mathcal{H}$ is conserved. This is the same result we
derived in Section 7.8.⁴

In Section 7.8, we proved a second result regarding time dependence: If the relation
of the generalized coordinates $q_1, \cdots, q_n$ to the underlying rectangular coordinates
is independent of $t$ (that is, our generalized coordinates are “natural”), then the
Hamiltonian $\mathcal{H}$ is just the total energy, $\mathcal{H} = T + U$. In the remainder of this chapter,
I shall consider only the case that the generalized coordinates are “natural” and that
$\mathcal{H}$ is not explicitly time-dependent. Thus it will be true from now on that $\mathcal{H}$ is the
total energy and that total energy is conserved.
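The result $d\mathcal{H}/dt = \partial\mathcal{H}/\partial t$ can be checked numerically for a Hamiltonian that does depend explicitly on time. The sketch below is only an illustration under stated assumptions: it uses a driven oscillator, $\mathcal{H} = p^2/2m + \tfrac{1}{2}kx^2 - F_0 x\cos\Omega t$, which is not an example from the text, with assumed parameter values and a hand-written fourth-order Runge-Kutta stepper. It integrates Hamilton's equations together with the running integral of $\partial\mathcal{H}/\partial t$ and compares that integral with the actual change in $\mathcal{H}$.

```python
import math

# Assumed illustrative parameters for a driven oscillator (not from the text).
M, K, F0, OMEGA = 1.0, 4.0, 0.5, 1.3


def hamiltonian(x, p, t):
    """H(x, p, t) = p^2/2M + K x^2/2 - F0 x cos(OMEGA t)."""
    return p * p / (2 * M) + 0.5 * K * x * x - F0 * x * math.cos(OMEGA * t)


def rhs(state, t):
    """state = (x, p, w); w accumulates the time integral of (partial H / partial t)."""
    x, p, _ = state
    x_dot = p / M                                   #  dH/dp
    p_dot = -K * x + F0 * math.cos(OMEGA * t)       # -dH/dx
    w_dot = F0 * OMEGA * x * math.sin(OMEGA * t)    #  partial H / partial t
    return (x_dot, p_dot, w_dot)


def rk4_step(state, t, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = rhs(state, t)
    k2 = rhs(shifted(k1, dt / 2), t + dt / 2)
    k3 = rhs(shifted(k2, dt / 2), t + dt / 2)
    k4 = rhs(shifted(k3, dt), t + dt)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


state, t, dt = (1.0, 0.0, 0.0), 0.0, 0.001
h_start = hamiltonian(state[0], state[1], t)
for _ in range(5000):
    state = rk4_step(state, t, dt)
    t += dt

h_change = hamiltonian(state[0], state[1], t) - h_start   # total change in H
integral_of_partial = state[2]                            # integral of dH/dt (partial)
print(h_change, integral_of_partial)
```

The two printed numbers should agree closely even though $\mathcal{H}$ itself is not conserved here; setting `F0 = 0.0` removes the explicit time dependence, and both numbers then stay near zero.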

Let us now work out an example of the Hamiltonian formalism for a system in two 
spatial dimensions. Unfortunately, like all reasonably simple examples, this does not 
exhibit any significant advantages of the Hamiltonian over the Lagrangian approach; 
rather, in this example, the Hamiltonian approach is just an alternative route to the 
same final equation of motion. 


Example 13.3 Hamilton’s Equations for a Particle
in a Central Force Field

Set up Hamilton’s equations for a particle of mass $m$ subject to a conservative
central force field with potential energy $U(r)$, using as generalized coordinates
the usual polar coordinates $r$ and $\phi$.

By conservation of angular momentum, we know that the motion is confined
to a fixed plane, in which we can define the polar coordinates $r$ and $\phi$. The
kinetic energy is given in terms of these generalized coordinates by the familiar
expression

$$T = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot{\phi}^2). \tag{13.27}$$

Since the equations relating $(r, \phi)$ to $(x, y)$ are time-independent, we know that
$\mathcal{H} = T + U$, which we must express in terms of $r$ and $\phi$ and the corresponding
generalized momenta $p_r$ and $p_\phi$. These latter are defined by the relation⁵
$p_i = \partial\mathcal{L}/\partial\dot{q}_i = \partial T/\partial\dot{q}_i$, which gives

$$p_r = \partial T/\partial\dot{r} = m\dot{r} \quad\text{and}\quad p_\phi = \partial T/\partial\dot{\phi} = mr^2\dot{\phi}. \tag{13.28}$$

The momentum $p_r$ conjugate to $r$ is just the radial component of the ordinary
momentum $m\mathbf{v}$, but, as we first saw in Section 7.1 [Equation (7.26)], the
momentum $p_\phi$ conjugate to $\phi$ is the angular momentum. We must next solve
the two equations (13.28) to give the velocities $\dot{r}$ and $\dot{\phi}$ in terms of the momenta
$p_r$ and $p_\phi$:

$$\dot{r} = \frac{p_r}{m} \quad\text{and}\quad \dot{\phi} = \frac{p_\phi}{mr^2}.$$

We can now substitute these into (13.27) and we arrive at the Hamiltonian,
expressed as a function of the proper variables,

$$\mathcal{H} = T + U = \frac{1}{2m}\left(p_r^2 + \frac{p_\phi^2}{r^2}\right) + U(r). \tag{13.29}$$

We can now write down the four Hamilton equations (13.25). The two radial
equations are

$$\dot{r} = \frac{\partial\mathcal{H}}{\partial p_r} = \frac{p_r}{m} \quad\text{and}\quad \dot{p}_r = -\frac{\partial\mathcal{H}}{\partial r} = \frac{p_\phi^2}{mr^3} - \frac{dU}{dr}. \tag{13.30}$$

The first of these reproduces the definition of the radial momentum. If we
substitute the first into the second, we obtain the familiar result that $m\ddot{r}$ is the
sum of the actual radial force $(-dU/dr)$ plus the centrifugal force $p_\phi^2/mr^3$. [See
Equations (8.24) and (8.26).] The two $\phi$ equations are

$$\dot{\phi} = \frac{\partial\mathcal{H}}{\partial p_\phi} = \frac{p_\phi}{mr^2} \quad\text{and}\quad \dot{p}_\phi = -\frac{\partial\mathcal{H}}{\partial\phi} = 0. \tag{13.31}$$

The first of these reproduces the definition of $p_\phi$. The second tells us what we
already knew, that the angular momentum is conserved. As in the previous two
examples, we see that the Hamiltonian formalism provides an alternative route
to the same final equations of motion as we could find using either the Newtonian
or Lagrangian approaches.

4 We proved there, in Equations (7.89) and (7.90), that $\mathcal{H}$ is conserved if and only if $\mathcal{L}$ does not
depend explicitly on the time. These two conditions ($\mathcal{H}$ not explicitly time dependent or $\mathcal{L}$ likewise)
are equivalent, for, as you can easily check, $\partial\mathcal{H}/\partial t = -\partial\mathcal{L}/\partial t$. See Problem 13.16.

5 In any problem where the potential energy $U = U(\mathbf{q})$ is independent of the velocities $\dot{\mathbf{q}}$ (as
we are certainly assuming here), there is this small simplification that we can replace $\mathcal{L}$ by $T$ in the
definition of $p_i$.


This example illustrates the general procedure to be followed in setting up Hamilton’s
equations for any given system:

1. Choose suitable generalized coordinates, $q_1, \cdots, q_n$.

2. Write down the kinetic and potential energies, $T$ and $U$, in terms of the $q$’s and
$\dot{q}$’s.

3. Find the generalized momenta $p_1, \cdots, p_n$. (We are now assuming our system
is conservative, so $U$ is independent of $\dot{q}_i$ and we can use $p_i = \partial T/\partial\dot{q}_i$. In
general, one must use $p_i = \partial\mathcal{L}/\partial\dot{q}_i$.)

4. Solve for the $\dot{q}$’s in terms of the $p$’s and $q$’s.

5. Write down the Hamiltonian $\mathcal{H}$ as a function of the $p$’s and $q$’s. [Provided
our coordinates are “natural” (relation between generalized coordinates and
underlying Cartesians is independent of time), $\mathcal{H}$ is just the total energy
$\mathcal{H} = T + U$, but when in doubt, use $\mathcal{H} = \sum p_i\dot{q}_i - \mathcal{L}$. See Problems 13.11
and 13.12.]

6. Write down Hamilton’s equations (13.25).
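The six steps above can be traced in code for the central-force problem of Example 13.3. The sketch below is a hedged illustration, not the text's own method: steps 1 to 5 are done by hand and appear only as comments and as the function `hamiltonian`, step 6 is the function `hamilton_rhs`, and the resulting equations are then integrated with a hand-written Runge-Kutta stepper. The attractive inverse-distance potential $U(r) = -k/r$, the parameter values, and all function names are assumptions chosen for the illustration.

```python
# Steps 1-2: coordinates (r, phi); T = M (r_dot^2 + r^2 phi_dot^2) / 2, U = U(r).
# Steps 3-4: p_r = M r_dot, p_phi = M r^2 phi_dot, solved for the velocities.
M = 1.0          # particle mass (assumed value)
K_FORCE = 1.0    # strength of the assumed potential U(r) = -K_FORCE / r


def potential(r):
    return -K_FORCE / r


def potential_slope(r):
    return K_FORCE / r**2        # dU/dr for the assumed potential


def hamiltonian(state):
    """Step 5: H = T + U in terms of (r, phi, p_r, p_phi), as in Eq. (13.29)."""
    r, phi, p_r, p_phi = state
    return (p_r**2 + p_phi**2 / r**2) / (2 * M) + potential(r)


def hamilton_rhs(state):
    """Step 6: the four Hamilton equations (13.30) and (13.31)."""
    r, phi, p_r, p_phi = state
    return (
        p_r / M,                                        # r_dot
        p_phi / (M * r**2),                             # phi_dot
        p_phi**2 / (M * r**3) - potential_slope(r),     # p_r_dot
        0.0,                                            # p_phi_dot (phi is absent from H)
    )


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = hamilton_rhs(state)
    k2 = hamilton_rhs(shifted(k1, dt / 2))
    k3 = hamilton_rhs(shifted(k2, dt / 2))
    k4 = hamilton_rhs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


state = (1.0, 0.0, 0.0, 1.2)          # assumed initial (r, phi, p_r, p_phi)
energy_start = hamiltonian(state)
for _ in range(10000):
    state = rk4_step(state, 0.001)
energy_end = hamiltonian(state)
print(energy_end - energy_start, state[3])
```

The printed energy difference should be very small and the angular momentum `state[3]` unchanged, in line with the conservation laws discussed in the example.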
If you look back over the last example, you will see that the solution followed all of
these six steps, and the same will be true of all later examples and problems. Before
we do another example, let us compare these six steps with the corresponding steps
in the Lagrangian approach. To set up Lagrange’s equations, we follow the same first
two steps (choose generalized coordinates and write down $T$ and $U$). Steps (3) and
(4) are unnecessary, since we don’t have to know the generalized momenta, nor to
eliminate the $\dot{q}$’s in favor of the $p$’s. Finally, one must carry out the analogs of (5) and
(6); namely, write down the Lagrangian and Lagrange’s equations. Evidently, setting
up the Hamiltonian approach involves two small extra steps [Steps (3) and (4) above]
as compared to the Lagrangian. Although both steps are usually quite straightforward,
this is undeniably a small disadvantage of Hamilton’s formalism. Now, here is another
example.


Example 13.4 Hamilton’s Equations for a Mass on a Cone

Consider a mass $m$ which is constrained to move on the frictionless surface of
a vertical cone $\rho = cz$ (in cylindrical polar coordinates $\rho, \phi, z$ with $z > 0$) in a
uniform gravitational field $g$ vertically down (Figure 13.3). Set up Hamilton’s
equations using $z$ and $\phi$ as generalized coordinates. Show that for any given
solution there are maximum and minimum heights $z_{\max}$ and $z_{\min}$ between which
the motion is confined. Use this result to describe the motion of the mass on the
cone. Show that for any given value of $z > 0$ there is a solution in which the
mass moves in a circular path at fixed height $z$.

Our generalized coordinates are $z$ and $\phi$, with $\rho$ determined by the constraint
that the mass remain on the cone, $\rho = cz$. The kinetic energy is therefore

$$T = \tfrac{1}{2}m\left[\dot{\rho}^2 + (\rho\dot{\phi})^2 + \dot{z}^2\right] = \tfrac{1}{2}m\left[(c^2 + 1)\dot{z}^2 + (cz\dot{\phi})^2\right].$$



Figure 13.3 A mass $m$ is constrained to move on the surface of the
cone shown. For clarity, the cone is shown truncated at the height of
the mass, although it actually continues upward indefinitely.

The potential energy is of course $U = mgz$. The generalized momenta are

$$p_z = \frac{\partial T}{\partial\dot{z}} = m(c^2 + 1)\dot{z} \quad\text{and}\quad p_\phi = \frac{\partial T}{\partial\dot{\phi}} = mc^2z^2\dot{\phi}. \tag{13.32}$$

These are trivially solved for $\dot{z}$ and $\dot{\phi}$, and we can write down the Hamiltonian,

$$\mathcal{H} = T + U = \frac{1}{2m}\left[\frac{p_z^2}{c^2 + 1} + \frac{p_\phi^2}{c^2z^2}\right] + mgz. \tag{13.33}$$

Hamilton’s equations are now easily found: The two $z$ equations are

$$\dot{z} = \frac{\partial\mathcal{H}}{\partial p_z} = \frac{p_z}{m(c^2 + 1)} \quad\text{and}\quad \dot{p}_z = -\frac{\partial\mathcal{H}}{\partial z} = \frac{p_\phi^2}{mc^2z^3} - mg. \tag{13.34}$$

The two $\phi$ equations are

$$\dot{\phi} = \frac{\partial\mathcal{H}}{\partial p_\phi} = \frac{p_\phi}{mc^2z^2} \quad\text{and}\quad \dot{p}_\phi = -\frac{\partial\mathcal{H}}{\partial\phi} = 0. \tag{13.35}$$

The last of these tells us, what we could well have guessed, that $p_\phi$, which is
just the $z$ component of angular momentum, is constant.

The easiest way to see that, for any given solution, $z$ is confined between
two bounds, $z_{\min}$ and $z_{\max}$, is to remember that the Hamiltonian function (13.33)
is equal to the total energy, and that energy is conserved. Thus, for any given
solution, (13.33) is equal to a fixed constant $E$. Now, the function $\mathcal{H}$ in (13.33)
is the sum of three positive terms, and as $z \to \infty$ the last term tends to infinity.
Since $\mathcal{H}$ must equal the fixed constant $E$, there must be a $z_{\max}$ which $z$ cannot
exceed. In the same way, the second term in (13.33) approaches infinity as
$z \to 0$; so there has to be a $z_{\min} > 0$ below which $z$ cannot go. In particular,
this means that the mass can never fall all the way into the bottom of the cone
at $z = 0$.⁶ The motion of the mass on the cone is now easy to describe. It moves
around the $z$ axis with constant angular momentum $p_\phi = mc^2z^2\dot{\phi}$. Since $p_\phi$
is constant, the angular velocity $\dot{\phi}$ varies, increasing as $z$ gets smaller and
decreasing as $z$ gets bigger. At the same time, the mass’s height $z$ oscillates
up and down between $z_{\min}$ and $z_{\max}$. (See Problems 13.14 and 13.17 for more
details.)

To investigate the possibility of a solution in which the mass stays at a fixed
height $z$, notice that this requires that $\dot{z} = 0$ for all time. This in turn requires
that $p_z = 0$ for all time, and hence $\dot{p}_z = 0$. From the second of the $z$ equations
(13.34), we see that $\dot{p}_z = 0$ if and only if

$$p_\phi = \pm\sqrt{m^2c^2gz^3}. \tag{13.36}$$

If, for any chosen initial height $z$, we launch the mass with $p_z = 0$ and $p_\phi$ equal
to one of these two values (going either clockwise or counterclockwise), then
since $\dot{p}_z = 0$, $p_z$ and hence $\dot{z}$ both remain zero, and the mass continues to move
at its initial height around a horizontal circle.

6 Two comments: It is easy to see that the second term in (13.33) is related to the centrifugal
force; thus, we can say that the mass is held away from the bottom of the cone by the centrifugal
force. Second, the one exception to this statement is if the angular momentum $p_\phi = 0$; in this case,
the mass moves up and down the cone in the radial direction ($\phi$ constant) and will eventually fall to
the bottom.
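Both claims of Example 13.4, the circular orbit at the special value of $p_\phi$ in (13.36) and the confinement of $z$ between two bounds otherwise, can be seen in a short numerical experiment. The following is a rough sketch under assumed values ($m = 1$, $c = 1$, $g = 9.8$, starting height $z = 1$); the Runge-Kutta stepper and the helper `run` are illustrative choices rather than anything prescribed by the text.

```python
import math

M, C, G = 1.0, 1.0, 9.8    # assumed mass, cone slope (rho = C z), and gravity


def hamiltonian(state):
    """Eq. (13.33) in terms of (z, phi, p_z, p_phi)."""
    z, phi, p_z, p_phi = state
    return (p_z**2 / (C**2 + 1) + p_phi**2 / (C**2 * z**2)) / (2 * M) + M * G * z


def hamilton_rhs(state):
    """Eqs. (13.34) and (13.35)."""
    z, phi, p_z, p_phi = state
    return (
        p_z / (M * (C**2 + 1)),                   # z_dot
        p_phi / (M * C**2 * z**2),                # phi_dot
        p_phi**2 / (M * C**2 * z**3) - M * G,     # p_z_dot
        0.0,                                      # p_phi_dot
    )


def rk4_step(state, dt):
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = hamilton_rhs(state)
    k2 = hamilton_rhs(shifted(k1, dt / 2))
    k3 = hamilton_rhs(shifted(k2, dt / 2))
    k4 = hamilton_rhs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def run(state, dt, steps):
    """Integrate and record the lowest and highest z reached."""
    z_low = z_high = state[0]
    for _ in range(steps):
        state = rk4_step(state, dt)
        z_low = min(z_low, state[0])
        z_high = max(z_high, state[0])
    return state, z_low, z_high


z0 = 1.0
# Launch with p_z = 0 and the special angular momentum of Eq. (13.36).
p_phi_circular = math.sqrt(M**2 * C**2 * G * z0**3)
_, circ_low, circ_high = run((z0, 0.0, 0.0, p_phi_circular), 0.001, 5000)

# Launch with a smaller (assumed) angular momentum: z now oscillates.
start = (z0, 0.0, 0.0, 2.0)
general_end, z_low, z_high = run(start, 0.001, 5000)
energy_drift = hamiltonian(general_end) - hamiltonian(start)
print(circ_low, circ_high)
print(z_low, z_high, energy_drift)
```

In the first run the recorded heights should be indistinguishable, as expected for a horizontal circle; in the second the mass dips below its starting height and returns, never reaching $z = 0$.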


13.4 Ignorable Coordinates 


So far we have set up Hamilton’s formalism and have seen that it is valid whenever 
Lagrange’s is. The former enjoys almost all the advantages and disadvantages of the 
latter, when either is compared with the Newtonian formalism. But it is not yet clear 
that there are any significant advantages to using Hamilton instead of Lagrange, or vice 
versa. As I have already mentioned, in more advanced theoretical work, Hamilton’s 
approach has some distinct advantages, and I shall try to give some feeling for a few
of these advantages in the next four sections.

In Chapter 7, we saw that if the Lagrangian $\mathcal{L}$ happens to be independent of a
coordinate $q_i$, then the corresponding generalized momentum $p_i$ is constant. [This
followed at once from the Lagrange equation $\partial\mathcal{L}/\partial q_i = (d/dt)\,\partial\mathcal{L}/\partial\dot{q}_i$, which can
be rewritten as $\partial\mathcal{L}/\partial q_i = \dot{p}_i$. Therefore, if $\partial\mathcal{L}/\partial q_i = 0$, it immediately follows that
$\dot{p}_i = 0$.] When this happens, we say that the coordinate $q_i$ is ignorable.

In the same way, if the Hamiltonian $\mathcal{H}$ is independent of $q_i$, it follows from the
Hamilton equation $\dot{p}_i = -\partial\mathcal{H}/\partial q_i$ that its conjugate momentum $p_i$ is a constant. We
saw this in Equation (13.31) of Example 13.3, where $\mathcal{H}$ was independent of $\phi$,
and the conjugate momentum $p_\phi$ (actually the angular momentum) was constant. The
results of this and the last paragraph are in fact the same result, since it is easy to prove
that $\partial\mathcal{L}/\partial q_i = -\partial\mathcal{H}/\partial q_i$ (Problem 13.22). Thus $\mathcal{L}$ is independent of $q_i$ if and only if
$\mathcal{H}$ is independent of $q_i$. If a coordinate $q_i$ is ignorable in the Lagrangian then it is also
ignorable in the Hamiltonian and vice versa.

It is nevertheless true that the Hamiltonian formalism is more convenient for
handling ignorable coordinates. To see this, let us consider a system with just two
degrees of freedom and suppose that the Hamiltonian is independent of $q_2$. This means
that the Hamiltonian depends on only three variables,

$$\mathcal{H} = \mathcal{H}(q_1, p_1, p_2). \tag{13.37}$$

For example, the Hamiltonian (13.29) for the central force problem has this property,
being independent of the coordinate $\phi$. Because $\mathcal{H}$ is independent of $q_2$, the conjugate
momentum is conserved: $p_2 = k$, a constant that is determined by the initial
conditions. Substitution of this constant into the Hamiltonian leaves

$$\mathcal{H} = \mathcal{H}(q_1, p_1, k),$$

which is a function of just the two variables $q_1$ and $p_1$, and the solution of the motion
is reduced to a one-dimensional problem with this effectively one-dimensional
Hamiltonian. More generally, if a system with $n$ degrees of freedom has an ignorable
coordinate $q_i$, then solution of the motion in the Hamiltonian framework is exactly
equivalent to a problem with $(n - 1)$ degrees of freedom in which $q_i$ and $p_i$ can be
entirely ignored. If there are several ignorable coordinates, the problem is
correspondingly further simplified.
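To make the reduction concrete, consider the mass on a cone of Example 13.4, where $\phi$ is ignorable. Replacing $p_\phi$ by a constant $k$ in (13.33) leaves a one-dimensional Hamiltonian in $(z, p_z)$, and the turning points of the motion are simply the heights where this reduced Hamiltonian, evaluated at $p_z = 0$, equals the energy. The sketch below finds them by bisection; the parameter values, the value of $k$, and the helper names are assumptions made for illustration.

```python
M, C, G = 1.0, 1.0, 9.8    # assumed mass, cone slope, and gravity


def reduced_hamiltonian(z, p_z, k):
    """Eq. (13.33) with the conserved momentum p_phi replaced by the constant k."""
    return (p_z**2 / (C**2 + 1) + k**2 / (C**2 * z**2)) / (2 * M) + M * G * z


def bisect_root(func, lo, hi, tol=1e-12):
    """Locate a root of func in [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


k = 2.0                                        # assumed value of the constant p_phi
energy = reduced_hamiltonian(1.0, 0.0, k)      # released with p_z = 0 at z = 1


def excess(z):
    """Reduced Hamiltonian at p_z = 0 minus the energy; zero at a turning point."""
    return reduced_hamiltonian(z, 0.0, k) - energy


# The p_z = 0 curve has a single minimum at z_star, which separates the two roots.
z_star = (k**2 / (M**2 * C**2 * G)) ** (1.0 / 3.0)
z_min = bisect_root(excess, 0.1, z_star)
z_max = bisect_root(excess, z_star, 5.0)
print(z_min, z_max)
```

Only the pair $(z, p_z)$ enters the calculation; the ignorable coordinate $\phi$ and its momentum appear solely through the constant `k`.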

In the Lagrangian formalism, it is of course true that if $q_i$ is ignorable, then $p_i$ is
constant, but this does not lead to the same elegant simplification: Supposing again that
the system has two degrees of freedom and that $q_2$ is ignorable, then corresponding
to (13.37) we would have

$$\mathcal{L} = \mathcal{L}(q_1, \dot{q}_1, \dot{q}_2).$$

Now, even though $q_2$ is ignorable and $p_2$ is a constant, it is not necessarily true that
$\dot{q}_2$ is constant. Thus the Lagrangian does not reduce cleanly to a one-dimensional
function that depends only on $q_1$ and $\dot{q}_1$. [For example, in the central-force problem,
the Lagrangian has the form $\mathcal{L}(r, \dot{r}, \dot{\phi})$, but, even though $\phi$ is ignorable, $\dot{\phi}$ still varies
as the motion proceeds, and the problem does not automatically reduce to a problem
with one degree of freedom.⁷]


13.5 Lagrange’s Equations vs. Hamilton’s Equations 


For a system with $n$ degrees of freedom, the Lagrangian formalism presents us with
$n$ second-order differential equations for the $n$ variables $q_1, \cdots, q_n$. For the same
system, the Hamiltonian formalism gives us $2n$ first-order differential equations for
the $2n$ variables $q_1, \cdots, q_n, p_1, \cdots, p_n$. That Hamilton could recast $n$ second-order
equations into $2n$ first-order equations is no particular surprise. In fact it is easy to
see that any set of $n$ second-order equations can be recast in this way: For simplicity,
let us consider the case that there is just one degree of freedom, so that Lagrange’s
approach gives just one second-order equation for the one coordinate $q$. This equation
can be written as

$$f(\ddot{q}, \dot{q}, q) = 0 \tag{13.38}$$

where $f$ is some function of its three arguments. Let us now define a second variable

$$s = \dot{q}. \tag{13.39}$$

In terms of this second variable, $\ddot{q} = \dot{s}$ and our original differential equation (13.38)
becomes

$$f(\dot{s}, s, q) = 0. \tag{13.40}$$

We have now replaced the one second-order equation (13.38) for $q$ with the two first-order
equations (13.39) and (13.40) for $q$ and $s$.
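This recasting is exactly what one does in practice before handing a second-order equation to a numerical integrator. The sketch below applies it to an assumed example that is not from the text, the damped oscillator $\ddot{q} + 2\beta\dot{q} + \omega_0^2 q = 0$: it introduces $s = \dot{q}$, integrates the resulting pair of first-order equations with a hand-written Runge-Kutta step, and compares with the known underdamped solution. The parameter values and function names are illustrative assumptions.

```python
import math

BETA, OMEGA0 = 0.3, 2.0     # assumed damping constant and natural frequency


def acceleration(q, s):
    """Solve f(q_ddot, q_dot, q) = q_ddot + 2 BETA q_dot + OMEGA0^2 q = 0 for q_ddot."""
    return -2.0 * BETA * s - OMEGA0**2 * q


def first_order_rhs(q, s):
    """The pair (13.39), (13.40): q_dot = s and s_dot = q_ddot."""
    return s, acceleration(q, s)


def rk4_step(q, s, dt):
    k1q, k1s = first_order_rhs(q, s)
    k2q, k2s = first_order_rhs(q + 0.5 * dt * k1q, s + 0.5 * dt * k1s)
    k3q, k3s = first_order_rhs(q + 0.5 * dt * k2q, s + 0.5 * dt * k2s)
    k4q, k4s = first_order_rhs(q + dt * k3q, s + dt * k3s)
    q_new = q + dt / 6 * (k1q + 2 * k2q + 2 * k3q + k4q)
    s_new = s + dt / 6 * (k1s + 2 * k2s + 2 * k3s + k4s)
    return q_new, s_new


q, s, dt, steps = 1.0, 0.0, 0.001, 3000     # start at q = 1 with zero velocity
for _ in range(steps):
    q, s = rk4_step(q, s, dt)

# Known underdamped solution for q(0) = 1, q_dot(0) = 0.
t_end = steps * dt
omega1 = math.sqrt(OMEGA0**2 - BETA**2)
q_exact = math.exp(-BETA * t_end) * (
    math.cos(omega1 * t_end) + BETA / omega1 * math.sin(omega1 * t_end))
print(q, q_exact)
```

Note that nothing here relies on a Hamiltonian: the trick works for any second-order equation that can be solved for $\ddot{q}$, which is the point made in the text.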


7 In this particular case, the difficulty is fairly easy to circumvent. In the discussion of Section
8.4, we wrote the Lagrange radial equation (8.24) in terms of the variable $\dot{\phi}$ and then rewrote it
eliminating $\dot{\phi}$ in favor of $p_\phi = \ell$, which is constant.

Evidently, the fact that Hamilton’s equations are first-order where Lagrange’s are
second-order is no particular improvement. However, the specific form of Hamilton’s
equations is in fact a big improvement. To see this, let us rewrite Hamilton’s equations
in a more streamlined form. First we can rewrite the first $n$ of Equations (13.25) as

$$\dot{q}_i = \frac{\partial\mathcal{H}}{\partial p_i} = f_i(\mathbf{q}, \mathbf{p}) \qquad [i = 1, \cdots, n] \tag{13.41}$$

where each $f_i$ is some function of $\mathbf{q}$ and $\mathbf{p}$, and we can combine these $n$ equations into
a single $n$-dimensional equation

$$\dot{\mathbf{q}} = \mathbf{f}(\mathbf{q}, \mathbf{p}) \tag{13.42}$$

where the boldface $\mathbf{f}$ stands for the vector made up of the $n$ functions $f_i = \partial\mathcal{H}/\partial p_i$.
In the same way we can rewrite the $n$ equations for the $\dot{p}_i$ in a similar form:

$$\dot{\mathbf{p}} = \mathbf{g}(\mathbf{q}, \mathbf{p}) \tag{13.43}$$

where the boldface $\mathbf{g}$ stands for the vector made up of the $n$ functions $g_i =
-\partial\mathcal{H}/\partial q_i$. Finally, we can introduce a $2n$-dimensional vector

$$\mathbf{z} = (\mathbf{q}, \mathbf{p}) = (q_1, \cdots, q_n, p_1, \cdots, p_n). \tag{13.44}$$

This phase-space vector or phase point $\mathbf{z}$ comprises all of the generalized coordinates
and all of their conjugate momenta. Each value of $\mathbf{z}$ labels a unique point in phase
space and identifies a unique set of initial conditions for our system. With this new
notation, we can combine the two equations (13.42) and (13.43) into a single grand,
$2n$-component equation of motion

$$\dot{\mathbf{z}} = \mathbf{h}(\mathbf{z}) \tag{13.45}$$

where the function $\mathbf{h}$ is a vector comprising the $2n$ functions $f_1, \cdots, f_n$ and $g_1, \cdots, g_n$
of Equations (13.42) and (13.43).

Equation (13.45) expresses Hamilton’s equations as a first-order differential equation
for the phase-space vector $\mathbf{z}$. Furthermore, it is a first-order equation with the
especially simple form:⁸

(first derivative of $\mathbf{z}$) = (function of $\mathbf{z}$).

A large part of the mathematical literature on differential equations is devoted to
equations with this standard form, and it is a distinct advantage of the Hamiltonian
formalism that Hamilton’s equations, unlike Lagrange’s, are automatically of this
form.
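The standard form $\dot{\mathbf{z}} = \mathbf{h}(\mathbf{z})$ also translates directly into code: given any function that evaluates $\mathcal{H}$ at a phase point, the vector $\mathbf{h}$ can be assembled from its partial derivatives. The sketch below does this with central finite differences, which is an assumed numerical shortcut rather than anything the text prescribes; the helper name `hamilton_vector_field` and the two-dimensional oscillator used to exercise it are likewise illustrative.

```python
def hamilton_vector_field(hamiltonian, z, eps=1e-6):
    """Return h(z) for z = [q_1..q_n, p_1..p_n], as in Eq. (13.45).

    The first n components are f_i = dH/dp_i and the last n are
    g_i = -dH/dq_i, each estimated by a central finite difference.
    """
    n = len(z) // 2

    def partial(index):
        up, down = list(z), list(z)
        up[index] += eps
        down[index] -= eps
        return (hamiltonian(up) - hamiltonian(down)) / (2 * eps)

    f = [partial(n + i) for i in range(n)]      # q_dot_i =  dH/dp_i
    g = [-partial(i) for i in range(n)]         # p_dot_i = -dH/dq_i
    return f + g


# Assumed test system: a two-dimensional isotropic oscillator.
MASS, SPRING = 2.0, 3.0


def oscillator_hamiltonian(z):
    q1, q2, p1, p2 = z
    return (p1**2 + p2**2) / (2 * MASS) + 0.5 * SPRING * (q1**2 + q2**2)


z = [1.0, 0.0, 0.0, 2.0]
h = hamilton_vector_field(oscillator_hamiltonian, z)
h_exact = [z[2] / MASS, z[3] / MASS, -SPRING * z[0], -SPRING * z[1]]
print(h)
print(h_exact)
```

Any general-purpose integrator for first-order systems can then be applied to `hamilton_vector_field` without knowing anything further about the mechanics, which is one practical reading of the advantage described above.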

Our combining of the $n$ position coordinates $\mathbf{q}$ with the $n$ momenta $\mathbf{p}$ to form a
single phase space vector $\mathbf{z} = (\mathbf{q}, \mathbf{p})$ suggests a certain equality between the position
and momentum coordinates in phase space, and this suggestion proves correct. We
have known since Chapter 7 that one of the strengths of the Lagrangian formalism
is its great flexibility with respect to coordinates: Any set of generalized coordinates
$\mathbf{q} = (q_1, \cdots, q_n)$ can be replaced by a second set $\mathbf{Q} = (Q_1, \cdots, Q_n)$, where each of
the new $Q_i$ is a function of the original $(q_1, \cdots, q_n)$,

$$\mathbf{Q} = \mathbf{Q}(\mathbf{q}), \tag{13.46}$$

and Lagrange’s equations will be just as valid with respect to the new coordinates
$\mathbf{Q}$ as they were with respect to the old $\mathbf{q}$.⁹ We can paraphrase this to say that Lagrange’s
equations are unchanged (or invariant) under any coordinate change in the
$n$-dimensional configuration space defined by $\mathbf{q} = (q_1, \cdots, q_n)$. The Hamiltonian
formalism shares this same flexibility: Hamilton’s equations are invariant under any
coordinate change (13.46) in configuration space. However, the Hamiltonian formalism
actually has a much greater flexibility and allows for certain coordinate changes
in the $2n$-dimensional phase space. We can consider changes of coordinates of the
form

$$\mathbf{Q} = \mathbf{Q}(\mathbf{q}, \mathbf{p}) \quad\text{and}\quad \mathbf{P} = \mathbf{P}(\mathbf{q}, \mathbf{p}), \tag{13.47}$$

that is, coordinate changes in which both the $q$’s and the $p$’s are intermingled.
If the equations (13.47) satisfy certain conditions, this change of coordinates is
called a canonical transformation, and it turns out that Hamilton’s equations are
invariant under these canonical transformations. Any further discussion of canonical
transformations would carry us beyond the scope of this book, but you should be aware
of their existence and that they are one of the properties that make the Hamiltonian
approach such a powerful theoretical tool.¹⁰ Problems 13.24 and 13.25 offer two
examples of canonical transformations.

8 If the Hamiltonian were explicitly time dependent, then (13.45) would take the (still very simple)
form $\dot{\mathbf{z}} = \mathbf{h}(\mathbf{z}, t)$. The form (13.45) without any explicit time dependence is said to be autonomous.


13.6 Phase-Space Orbits 


One can view the phase-space vector $\mathbf{z} = (\mathbf{q}, \mathbf{p})$ of (13.44) as defining the system’s
“position” in phase space. Any point $\mathbf{z}_0$ defines a possible initial condition (at any chosen
time $t_0$), and Hamilton’s equations (13.45) define a unique phase-space orbit or
trajectory which starts from $\mathbf{z}_0$ at $t_0$ and which the system follows as time progresses.
Since phase space has $2n$ dimensions, the visualization of these orbits presents some
challenges unless $n = 1$. For example, for a single unconstrained particle in three
dimensions, $n = 3$, and the phase space is six-dimensional, not something that most
of us can visualize easily. There are various techniques, such as the Poincaré section
described in Section 12.8, for viewing phase-space orbits in a subspace of fewer
dimensions than the full phase space, but here I shall give just two examples of systems
for which $n = 1$, and the phase space is therefore only two-dimensional.

Figure 13.4 One can imagine two orbits passing through the same
point $\mathbf{z}_0$ in phase space. However, Hamilton’s equations guarantee that
for any given point $\mathbf{z}_0$, there is a unique orbit passing through $\mathbf{z}_0$, so the
two orbits must in fact be the same.

9 Of course, this statement has to be qualified a little: The coordinates $\mathbf{Q}$ must be “reasonable”
in the sense that each set $\mathbf{Q}$ determines a unique set $\mathbf{q}$ and vice versa, and the function $\mathbf{Q}(\mathbf{q})$ has to
be suitably differentiable.

10 I should emphasize that there is no corresponding transformation in Lagrangian mechanics,
which operates in the state space defined by the $2n$-dimensional vector $(\mathbf{q}, \dot{\mathbf{q}})$. Since $\dot{\mathbf{q}}$ is defined as
the time derivative of $\mathbf{q}$, there is no analog to (13.47) in which the $q$’s and $\dot{q}$’s get intermingled.

Before we look at these examples, there is an important property of phase-space
orbits that deserves mention right away: It is easy to see that no two different
phase-space orbits can pass through the same point in phase space; that is, no two phase-space
orbits can cross one another. For suppose two orbits pass through one and the same
point $\mathbf{z}_0$, as in Figure 13.4.

Now, from Hamilton’s equations (13.45), it follows that for any point $\mathbf{z}_0$ there
can be only one distinct orbit passing through $\mathbf{z}_0$, since the first-order equation
$\dot{\mathbf{z}} = \mathbf{h}(\mathbf{z})$ fixes the velocity $\dot{\mathbf{z}}$ at each point. Therefore the two orbits passing
through $\mathbf{z}_0$ have to be the same. Notice that this result excludes different orbits from
passing through the same point even at different times: If one orbit passes through $\mathbf{z}_0$
today, then no different orbit can pass through $\mathbf{z}_0$ today, yesterday, or tomorrow.¹¹ This
result, that no two phase-space orbits can cross, places severe restrictions on the
way in which these orbits are traced out in phase space. It has important consequences
in, for example, the analysis of chaotic motion of Hamiltonian systems.


Example 13.5 A One-Dimensional Harmonic Oscillator

Set up Hamilton’s equations for a one-dimensional simple-harmonic oscillator
with mass $m$ and force constant $k$, and describe the possible orbits in the phase
space defined by the coordinates $(x, p)$.

The kinetic energy is $T = \tfrac{1}{2}m\dot{x}^2$ and the potential energy is $U = \tfrac{1}{2}kx^2 =
\tfrac{1}{2}m\omega^2x^2$, if we introduce the natural frequency $\omega = \sqrt{k/m}$. The generalized
momentum is $p = \partial T/\partial\dot{x} = m\dot{x}$, and the Hamiltonian (written as a function of
$x$ and $p$) is

$$\mathcal{H} = T + U = \frac{p^2}{2m} + \frac{1}{2}m\omega^2x^2. \tag{13.48}$$

Thus, Hamilton’s equations give

$$\dot{x} = \frac{\partial\mathcal{H}}{\partial p} = \frac{p}{m} \quad\text{and}\quad \dot{p} = -\frac{\partial\mathcal{H}}{\partial x} = -m\omega^2x.$$

The simplest way to solve these two equations is to eliminate $p$ and get the
familiar second-order equation $\ddot{x} = -\omega^2x$, with the equally familiar solution

$$x = A\cos(\omega t - \delta) \quad\text{and hence}\quad p = m\dot{x} = -m\omega A\sin(\omega t - \delta). \tag{13.49}$$

Phase space for the one-dimensional oscillator is the two-dimensional space
with coordinates $(x, p)$. In this space, the solution (13.49) is the parametric form
for an ellipse, traced in the clockwise direction, as in Figure 13.5, which shows
two phase-space orbits for the cases that the oscillator started out from rest at
$x = A$ (solid curve) and $x = A/2$ (dashed curve). That the orbits have to be
ellipses follows from conservation of energy: The total energy is given by the
Hamiltonian (13.48), whose initial value (for the solid curve with $x = A$ and
$p = 0$) is $\tfrac{1}{2}m\omega^2A^2$. Thus conservation of energy implies that

$$\frac{p^2}{2m} + \frac{1}{2}m\omega^2x^2 = \frac{1}{2}m\omega^2A^2$$

or

$$\frac{x^2}{A^2} + \frac{p^2}{(m\omega A)^2} = 1. \tag{13.50}$$

This is the equation for an ellipse with semimajor and semiminor axes $A$ and
$m\omega A$, in agreement with (13.49).

It is perhaps worth following one of the phase-space orbits of Figure 13.5 in
detail. For the solid curve, the motion starts from rest with $x$ at its maximum, at
the point $x = A$, $p = 0$, shown as a dot in the figure. The restoring force causes
$m$ to accelerate back toward $x = 0$, so that $x$ gets smaller while $p$ becomes
increasingly negative. By the time $x$ has reached 0, $p$ has reached its largest
negative value $p = -m\omega A$. The oscillator now overshoots the equilibrium point,
so $x$ becomes increasingly negative, while $p$ is still negative but getting less so.
By the time $x$ reaches $-A$, $p$ is again zero, and so on, until the oscillator gets
back to its starting point $x = A$, $p = 0$ and the cycle starts over again. Notice
that, in agreement with our general argument, the two orbits shown in Figure
13.5 do not cross one another. In fact, it is easy to see that no two ellipses with
the form (13.50) can have any point in common unless they have the same value
of $A$ (in which case they are the same ellipse).

Figure 13.5 The phase space for a one-dimensional harmonic oscillator is
a plane, with axes labelled by $x$ (the position) and $p$ (the momentum), in
which the point representing the state of motion traces a clockwise ellipse.
There is a unique orbit through each phase point $(x, p)$. The outer orbit
(solid curve) started from rest with $x = A$ and $p = 0$; the inner one started
from $x = A/2$ and $p = 0$.

11 This is because our Hamiltonian is time-independent. If $\mathcal{H}$ is explicitly time-dependent, then
we can assert only that no two orbits can pass through one point at the same time.

In this simple one-dimensional case, the phase-space plot doesn’t tell us
anything we couldn’t learn from the simple solution $x = A\cos(\omega t - \delta)$, but
I hope you will agree that it does show some details of the motion rather clearly.
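The geometric statements in this example are easy to confirm numerically. The sketch below samples the solution (13.49), checks that every sampled point satisfies the ellipse equation (13.50), and uses the sign of $x\dot{p} - p\dot{x}$ (negative for clockwise motion when $x$ is plotted horizontally and $p$ vertically) to confirm the sense of rotation. The numerical values of $m$, $\omega$, and $A$ and the helper names are assumptions made for the illustration.

```python
import math


def orbit_point(t, amplitude, m, omega, delta=0.0):
    """Phase point (x, p) at time t from the solution (13.49)."""
    x = amplitude * math.cos(omega * t - delta)
    p = -m * omega * amplitude * math.sin(omega * t - delta)
    return x, p


def ellipse_residual(x, p, amplitude, m, omega):
    """Left side of (13.50) minus 1; zero for points on the ellipse."""
    return x**2 / amplitude**2 + p**2 / (m * omega * amplitude)**2 - 1.0


def rotation_sense(x, p, m, omega):
    """x * p_dot - p * x_dot, using Hamilton's equations; negative means clockwise."""
    x_dot = p / m
    p_dot = -m * omega**2 * x
    return x * p_dot - p * x_dot


m, omega, amplitude = 0.5, 2.0, 1.5       # assumed illustrative values
times = [0.1 * i for i in range(63)]      # roughly two periods

outer = [orbit_point(t, amplitude, m, omega) for t in times]
inner = [orbit_point(t, amplitude / 2, m, omega) for t in times]

worst_residual = max(abs(ellipse_residual(x, p, amplitude, m, omega)) for x, p in outer)
all_clockwise = all(rotation_sense(x, p, m, omega) < 0 for x, p in outer)
# Points of the inner orbit lie strictly inside the outer ellipse (negative residual).
inner_inside = all(ellipse_residual(x, p, amplitude, m, omega) < 0 for x, p in inner)
print(worst_residual, all_clockwise, inner_inside)
```

The last check reflects the non-crossing property: the orbit started at $A/2$ never touches the orbit started at $A$.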


If you have read Chapter 12, you will recognize that a phase-space orbit is closely
related to the state-space orbit described in Section 12.7. The only difference is that
the former traces the system’s evolution in the $(x, p)$ plane whereas the latter uses
the $(x, \dot{x})$ plane. In the present case there is almost no difference, since for
one-dimensional motion along the $x$ axis, $p$ is proportional to $\dot{x}$ (specifically, $p = m\dot{x}$).
Thus the two kinds of plot are identical except that the former is stretched by a factor
of $m$ in the vertical direction. Nevertheless, in the general $2n$-dimensional case, the
spaces defined by $(\mathbf{q}, \mathbf{p})$ and by $(\mathbf{q}, \dot{\mathbf{q}})$ can be very different. As we shall see in the next
section, the phase-space plot has some elegant properties not shared by the state-space
one.

It often happens that one needs to follow not just one orbit but several different 
orbits through phase space. For example, in the study of chaos we saw that it is of 
great interest to follow the evolution of two identical systems that are launched with 
slightly different initial conditions. If the motion is nonchaotic, then the two systems 
remain close together in phase space, but if the motion is chaotic they move apart so 
rapidly that their detailed motion is effectively unpredictable. In the next example, we 
look at four neighboring phase-space orbits of a particle falling under the influence 
of gravity. 


EXAMPLE 13.6 A Falling Body 

Set up the Hamiltonian formalism for a mass m constrained to move in a 
vertical line, subject only to the force of gravity. Use the coordinate x, measured 
downward from a convenient origin, and its conjugate momentum. Describe the 
phase-space orbits and in particular sketch the orbits from time 0 to a later time 
t for the following four different initial conditions at t = 0: 

(a) x₀ = p₀ = 0 (that is, the mass is released from rest at x = 0); 

(b) x₀ = X, but p₀ = 0 (the mass is released from rest at x = X); 

(c) x₀ = X and p₀ = P (the mass is thrown from x = X with initial 
momentum p = P); 

(d) x₀ = 0 and p₀ = P (the mass is thrown from x = 0 with initial momentum 
p = P). 

The kinetic and potential energies are T = ½mẋ² and U = −mgx. (Remember 
that x is measured downward.) The conjugate momentum is, of course, 
p = mẋ and the Hamiltonian is 

$$ \mathcal{H} = T + U = \frac{p^2}{2m} - mgx. \qquad (13.51) $$

To find the shape of the phase-space orbits, we don’t have to solve the equations 
of motion, since conservation of energy requires that they satisfy ℋ = const. 
For the Hamiltonian (13.51), this defines a parabola with the form x = kp² + 
const, with its symmetry axis on the x axis. 

To draw the four orbits asked for, it is helpful to solve the equations of motion 

$$ \dot{x} = \frac{\partial\mathcal{H}}{\partial p} = \frac{p}{m} \qquad\text{and}\qquad \dot{p} = -\frac{\partial\mathcal{H}}{\partial x} = mg. $$

The second of these gives 

$$ p = p_0 + mgt $$

and the first then gives 

$$ x = x_0 + \frac{p_0}{m}\,t + \frac{1}{2}gt^2, $$

both results that are very familiar from elementary mechanics. Putting in 
the given initial conditions, one gets the four curves shown in Figure 13.6. 
As expected, the four orbits A₀A, ..., D₀D are parabolas, no one of which 
crosses any other. You can see that the initial rectangle A₀B₀C₀D₀ has evolved 
into a parallelogram ABCD. However, it is easy to show that the area of 
the parallelogram is the same as that of the original rectangle. (Same base, 
A₀B₀ = AB, and same height. See Problem 13.27.) We shall see in the next 
section that these two properties (changing shape but unchanging area) are 
general properties of all phase-space orbits. In particular, that the area does not 
change is an example of the important result known as Liouville’s theorem. 

Figure 13.6 Four different phase-space orbits for a body moving vertically 
under the influence of gravity, with position x (measured vertically down) 
and momentum p. The four different initial states at time 0 are shown by the 
dots labeled A₀, B₀, C₀, and D₀, with corresponding final states at a later 
time t labeled A, B, C, and D. 
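The claim that the rectangle A₀B₀C₀D₀ and the parallelogram ABCD have equal areas is easy to check numerically. The sketch below applies the solution p = p₀ + mgt, x = x₀ + (p₀/m)t + ½gt² to the four corners and compares areas with the shoelace formula. The numerical values of m, g, X, P, and t are arbitrary choices made for illustration.

```python
M, G = 1.5, 9.8  # illustrative mass and gravitational acceleration


def evolve(x0, p0, t):
    """Phase point at time t for a body falling from (x0, p0); x is measured downward."""
    p = p0 + M * G * t
    x = x0 + (p0 / M) * t + 0.5 * G * t * t
    return x, p


def polygon_area(points):
    """Unsigned area of a polygon from the shoelace formula."""
    total = 0.0
    for (x1, p1), (x2, p2) in zip(points, points[1:] + points[:1]):
        total += x1 * p2 - x2 * p1
    return abs(total) / 2.0


X, P, T = 2.0, 3.0, 1.2
initial = [(0.0, 0.0), (X, 0.0), (X, P), (0.0, P)]   # A0, B0, C0, D0
final = [evolve(x0, p0, T) for x0, p0 in initial]     # A, B, C, D

area_before = polygon_area(initial)
area_after = polygon_area(final)
base_after = final[1][0] - final[0][0]                # length of AB

print(area_before, area_after, base_after)
```

The map from (x₀, p₀) to (x, p) is a shear plus a translation, which is why the base AB and the height of the parallelogram match those of the rectangle.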


13.7 Liouville’s Theorem* 


* This section contains material that is more advanced than most other material in this book. 
In particular, it uses the divergence operator and divergence theorem of vector calculus. If 
you haven’t met these ideas before, you could, if you wish, omit this section. 

In Example 13.6 we saw that one can use a plot of phase space to track the motion of 
several identical systems evolving from various different initial conditions. In many 
problems, especially those of statistical mechanics, one has occasion to track the 
motion of an enormous number of identical systems. The state of each system can 
be labelled by a dot in phase space, and, if the number of these dots is large enough, 
we can view the resulting swarm of dots as a kind of fluid, with a density ρ measured 
in dots per volume of phase space. For example, in the statistical mechanics of an ideal 
gas, one wants to follow the motion of some 10²³ identical molecules as they move 
inside a container. Each molecule is governed by the same Hamiltonian and moves 
in the same six-dimensional phase space with coordinates (x, y, z, p_x, p_y, p_z). Thus 
the state of the system at any one time can be specified by giving the positions of 10²³ 
dots in this phase space; these 10²³ dots form a swarm whose motion can (for many 
purposes) be treated like that of a fluid. For most of this section, you could bear this 
example in mind, though in specific cases I shall often specialize to a system with just 
one spatial dimension and hence two dimensions of phase space. 

The tracking of the motion of many identical systems by means of a cloud of dots 
in their phase space is illustrated in Figure 13.7. This picture could be viewed as a 
schematic representation of a multidimensional phase space, but let us, for simplicity, 
think of it as the two-dimensional phase space, with the coordinates z = (q, p) of 
a system with one degree of freedom. Each dot in the lower cloud represents the 
initial state of one system (for example, one molecule in a gas) by giving its position 
z = (q, p) in phase space at time t₀. Hamilton’s equations determine the “velocity” 
with which each dot moves through phase space: 

$$ \text{(phase-space velocity)} = \dot{\mathbf{z}} = (\dot{q}, \dot{p}) = \left(\frac{\partial\mathcal{H}}{\partial p},\, -\frac{\partial\mathcal{H}}{\partial q}\right). \qquad (13.52) $$

For each dot z in the initial cloud, there is a unique velocity ż, and each dot moves 
off with its assigned velocity. In general, different dots will have different velocities, 
and the cloud can change its shape and orientation, as shown. On the other hand, as 
we shall prove shortly, the volume occupied by the cloud cannot change, a result 
called Liouville’s theorem. 

Figure 13.7 The dots in the lower cloud label the states of some 300 
identical systems at time t₀. As time progresses each dot moves through 
phase space in accordance with Hamilton’s equations, and the whole 
cloud moves to the upper position. 


To make this last idea more precise, we need to consider a closed surface in phase 
space such as that shown at the lower left of Figure 13.8. Each point on this closed 
surface defines a unique set of initial conditions and moves along the corresponding 
phase-space orbit as time advances. Thus the whole surface moves through phase 
space. It is easy to see that any point that is initially inside the surface must remain 
inside for all time: For suppose that such a point could move outside. Then at a certain 
moment it would have to cross the moving surface. At this moment we would have 
two distinct orbits crossing one another, which we know is impossible. By the same 
argument, any point that is initially outside the surface remains outside for all time. 
Thus the number of dots (representing molecules, for example) inside the surface is a 
constant in time. The main result of this section — Liouville’s theorem — is that the 
volume of the moving closed surface in Figure 13.8 is constant in time. To prove this, 
we need to know the relationship between the rate of change of this kind of volume 
and the so-called divergence of the velocity vector. 



Figure 13.8 As time progresses, the closed surface at the lower left 
moves through phase space. Any point that is initially inside the surface 
remains inside for all time. 


Changing Volumes 

The two mathematical results that we need hold in spaces with any number of 
dimensions. We shall need to apply them in the 2n-dimensional phase space. Nevertheless, 
let us consider them first in the familiar context of our everyday three-dimensional 
space. We imagine a three-dimensional space filled with a moving fluid. The fluid at 
each point r is moving with velocity v. For each position r there is a unique velocity 
v, but v can, of course, have different values at different points r; thus, we can write 
v = v(r). This is exactly analogous to the situation in phase space, where each phase 
point z is moving with a velocity that is uniquely determined by its position in phase 
space, ż = ż(z). 

Figure 13.9 (a) The surface S moves with the fluid from its initial position 
(solid curve) at time t to a new position (dashed curve) at time t + δt. The 
change in V is just the volume between the two surfaces. (b) Enlarged view 
of the shaded volume of part (a). The vector n is a unit vector normal to 
the surface S, pointing outward. 

Let us now consider a closed surface S in the fluid at a certain time t. We can 
imagine marking this surface with dye so that we can follow its motion as it moves with 
the fluid. The question we have to ask is this: If V denotes the volume contained inside 
S, how fast does V change as the fluid moves? Figure 13.9(a) shows two successive 
positions of the surface S at two successive moments a short time δt apart. The change 
in V during the interval δt is the volume between these two surfaces. To evaluate this, 
consider first the contribution of the small shaded volume in Figure 13.9(a), which is 
enlarged in Figure 13.9(b). This small volume is a cylinder whose base has an area 
dA. The side of the cylinder is given by the displacement v δt, so the cylinder’s height 
is the component of v δt normal to the surface. If we introduce a unit vector n in the 
direction of the outward normal to S, then the height of our cylinder is n · v δt and its 
volume is n · v δt dA. The total change in the volume inside our surface S is found by 
adding up all these small contributions to give 

$$ \delta V = \int_S \mathbf{n}\cdot\mathbf{v}\,\delta t\,dA, \qquad (13.53) $$

where the integral is a surface integral running over the whole of our closed surface S. 
Dividing both sides by δt and letting δt → 0, we get the first of our two key results: 

$$ \frac{dV}{dt} = \int_S \mathbf{n}\cdot\mathbf{v}\,dA. \qquad (13.54) $$

Figure 13.9(a) showed a fluid flow with the velocity v everywhere outward, as 
would be the case for the expanding air in a balloon whose temperature is rising. With 
v outward, the scalar product n · v is positive (since n was defined as the outward 
normal) and the integral (13.54) is positive, implying that V is increasing, as it should. 
If v were everywhere inward (air in a balloon whose temperature is falling), then n · v 
would be negative and V would be decreasing. In general, n · v can be positive on parts 
of the surface S and negative on others. In particular, if the fluid is incompressible, so 
that V can’t change, then the contributions from positive and negative values of n · v 
must exactly cancel so that dV/dt can be zero. 

We have derived the result (13.54) for a surface S and volume V in three 
dimensions, but it is equally valid in any number of dimensions. In an m-dimensional 
space both of the vectors n and v have m components and their scalar product is 
n · v = n₁v₁ + ··· + n_m v_m. With this definition, the result (13.54) is valid whatever 
the value of m. 
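To see (13.54) at work in a space that is not three-dimensional, here is a small two-dimensional check, where the “volume” is an area and the “surface” is a closed curve. The velocity field v = (ax, by) and the unit circle are assumptions chosen only because both sides can also be worked out by hand, giving π(a + b).

```python
import math

A_COEF, B_COEF = 0.3, -0.1  # illustrative constants in v = (a x, b y)


def velocity(x, y):
    return A_COEF * x, B_COEF * y


def polygon_area(points):
    """Unsigned area of a polygon from the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def boundary_flux(n_pts=2000):
    """Midpoint-rule estimate of the integral of n . v around the unit circle."""
    d_len = 2 * math.pi / n_pts
    total = 0.0
    for k in range(n_pts):
        theta = 2 * math.pi * (k + 0.5) / n_pts
        nx, ny = math.cos(theta), math.sin(theta)  # outward normal = position on unit circle
        vx, vy = velocity(nx, ny)
        total += (nx * vx + ny * vy) * d_len
    return total


def area_growth_rate(n_pts=2000, dt=1e-6):
    """Move each boundary point by v*dt and return (change in area)/dt."""
    ring = [(math.cos(2 * math.pi * k / n_pts), math.sin(2 * math.pi * k / n_pts))
            for k in range(n_pts)]
    moved = []
    for x, y in ring:
        vx, vy = velocity(x, y)
        moved.append((x + vx * dt, y + vy * dt))
    return (polygon_area(moved) - polygon_area(ring)) / dt


flux = boundary_flux()
rate = area_growth_rate()
print(flux, rate, math.pi * (A_COEF + B_COEF))
```

The two estimates agree only approximately, because the boundary is replaced by a polygon and the time derivative by a finite difference.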


The Divergence Theorem 

The second mathematical result that we need is called the divergence theorem or 
Gauss’s theorem. This is one of the standard results of vector calculus (similar to 
Stokes’s theorem that we used in Chapter 4) and you can find its proof in any text on 
vector calculus, 12 although its proof is quite straightforward and instructive in various 
simple cases. (See Problem 13.37.) The theorem involves the vector operator called 
the divergence. For any vector v, the divergence of v is defined as 

$$ \nabla\cdot\mathbf{v} = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}. \qquad (13.55) $$

Here v can be any vector (a force or an electric field, for instance), but for us it will 
always be a velocity. If you have not met the divergence operator before, you might 
like to practice with some of Problems 13.31, 13.32, or 13.34. 

The divergence theorem asserts that the surface integral in (13.54) can be expressed 
in terms of ∇ · v: 

$$ \int_S \mathbf{n}\cdot\mathbf{v}\,dA = \int_V \nabla\cdot\mathbf{v}\,dV. \qquad (13.56) $$

Here the integral on the right is a volume integral over the volume V interior to 
the surface S. This theorem is an amazingly powerful tool. It plays a crucial role 
in applications of Gauss’s law of electrostatics. It often allows us to perform integrals 
that would otherwise be hard to evaluate. In particular, there are many interesting fluid 
flows with the property that ∇ · v = 0; for such flows, the integral on the left may be 
very awkward to evaluate directly, but thanks to the divergence theorem we can see 
immediately that the integral is in fact zero. Since this is how we shall be using the 
divergence theorem, let us look at an example of such a flow right away. 


12 See, for example, Mary Boas, Mathematical Methods in the Physical Sciences (Wiley, 1983), 
p. 271. 






EXAMPLE 13.7 A Shearing Flow 

The velocity of flow of a certain fluid is 

$$ \mathbf{v} = ky\,\hat{\mathbf{x}} \qquad (13.57) $$

where k is a constant; that is, v_x = ky and v_y = v_z = 0. Describe this flow and 
sketch the motion of a closed surface that starts out as a sphere. Evaluate the 
divergence ∇ · v and show that the volume enclosed by any closed surface S 
moving with the fluid cannot change; that is, the fluid flows incompressibly. 

The velocity v is everywhere in the x direction and depends only on y. Thus 
all the points in any plane y = constant move like a sliding rigid plate — a pattern 
called laminar flow. 13 The speed increases with y, so each plane moves a little 
faster than the planes below it, in a shearing motion, as indicated by the three 
thin arrows in Figure 13.10. If we consider a closed surface moving with the fluid 
and initially spherical, then its top will be dragged along a little faster than its 
bottom and it will be stretched into an ellipsoid of ever-increasing eccentricity 
as shown. 

We can easily evaluate ∇ · v using the definition (13.55): 

$$ \nabla\cdot\mathbf{v} = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z} = \frac{\partial (ky)}{\partial x} + 0 + 0 = 0. \qquad (13.58) $$

Now, combining our two results (13.54) and (13.56), we see that the rate of 
change of the volume inside any closed surface is 

$$ \frac{dV}{dt} = \int_V \nabla\cdot\mathbf{v}\,dV. \qquad (13.59) $$

Since we have shown that ∇ · v = 0 everywhere, it follows that dV/dt = 0 and 
the volume inside any closed surface moving with the fluid is constant for the 
flow of (13.57). 

Figure 13.10 The fluid flow described by (13.57) is a laminar, shearing 
flow. The planes parallel to the plane y = 0 all move rigidly in the x 
direction with speed proportional to y. This shearing motion stretches 
the sphere into an ellipsoid. 

13 From the Latin lamina meaning a thin layer or plate, from which we also get the verb “to 
laminate.” 
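The conclusion of this example can be checked directly, because the flow (13.57) is simple enough to integrate exactly: a point starting at (x₀, y₀, z₀) is at (x₀ + ky₀t, y₀, z₀) at time t. The sketch below moves the corners of a tetrahedron with the fluid and compares volumes, and also follows the top and bottom of a unit sphere to show the stretching. The tetrahedron, the value of k, and the time t are illustrative assumptions.

```python
import math

K = 0.5  # illustrative shear constant in v = k y x_hat


def flow(point, t):
    """Position at time t of the fluid element that started at `point`."""
    x, y, z = point
    return (x + K * y * t, y, z)


def tetra_volume(a, b, c, d):
    """Volume of a tetrahedron from the scalar triple product of its edge vectors."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0


corners = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
t = 4.0
vol_before = tetra_volume(*corners)
vol_after = tetra_volume(*[flow(c, t) for c in corners])

top, bottom = (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)   # top and bottom of a unit sphere
stretch = math.dist(flow(top, t), flow(bottom, t))

print(vol_before, vol_after)  # equal volumes
print(stretch)                # larger than the original diameter 2
```

Because this flow map is linear in the starting point, a tetrahedron stays a tetrahedron, so comparing corner volumes is exact here; for a general flow it would only be approximate.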

If you have never met the divergence theorem before, you might want to try one or 
both of Problems 13.33 and 13.37. Here let me say just one more thing about the 
significance of ∇ · v. If we apply (13.59) to a sufficiently small volume V, then ∇ · v 
will be approximately constant throughout the region of integration, and the right side 
of (13.59) becomes just (∇ · v)V and (13.59) itself implies that 

$$ \nabla\cdot\mathbf{v} = \frac{1}{V}\frac{dV}{dt} $$

for any small volume V. Since dV/dt can be called the outward flow of v, we can say 
that ∇ · v is the outward flow per volume. If ∇ · v is positive at a point r, then there is 
an outward flow around r and any small volume around r is expanding (like the gas in 
a balloon that is heating up); if ∇ · v is negative, then there is an inward flow and any 
small volume around r is contracting (like the gas in a balloon that is cooling down). 
In our example, ∇ · v was zero, and any volume moving with the fluid was constant. 
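The reading of ∇ · v as outward flow per volume can be tried out on a small box. In the sketch below the field v = (x², y, 0) is an assumption chosen so that each face of an axis-aligned box moves parallel to itself and the box stays a box; its divergence is 2x + 1. The box size and time step are likewise illustrative.

```python
def velocity(x, y, z):
    """Illustrative field with divergence 2x + 1."""
    return x * x, y, 0.0


def moved_extent(lo, hi, component, dt):
    """Length of an edge after its two ends have moved for a time dt."""
    return (hi + component(hi) * dt) - (lo + component(lo) * dt)


def expansion_rate(center, side=1e-3, dt=1e-6):
    """Estimate (1/V) dV/dt for a small cube centered at `center`."""
    cx, cy, cz = center
    h = side / 2.0
    new_x = moved_extent(cx - h, cx + h, lambda x: velocity(x, cy, cz)[0], dt)
    new_y = moved_extent(cy - h, cy + h, lambda y: velocity(cx, y, cz)[1], dt)
    new_z = moved_extent(cz - h, cz + h, lambda z: velocity(cx, cy, z)[2], dt)
    v_old = side ** 3
    v_new = new_x * new_y * new_z
    return (v_new - v_old) / (v_old * dt)


rate = expansion_rate((0.5, 0.2, 0.0))
divergence = 2 * 0.5 + 1   # hand-computed divergence at x = 0.5
print(rate, divergence)
```

The estimate approaches the divergence at the center of the box as the box and the time step shrink, which is the content of the small-volume argument above.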

The divergence generalizes easily to any number of dimensions. In an 
m-dimensional space with coordinates (x₁, ..., x_m) the divergence of a vector 
v = (v₁, ..., v_m) is defined as 

$$ \nabla\cdot\mathbf{v} = \frac{\partial v_1}{\partial x_1} + \cdots + \frac{\partial v_m}{\partial x_m} $$

and, with these definitions, the crucial result (13.59) takes exactly the same form, 
except that the integral is an integral over an m-dimensional region with volume 
element dV = dx₁ dx₂ ··· dx_m. 


Liouville’s Theorem 

We are finally ready to prove the main result of this section, Liouville’s theorem. This 
is a theorem about motion in phase space, the 2n-dimensional space with coordinates 
z = (q, p) = (q₁, ..., q_n, p₁, ..., p_n). To simplify the notation, I shall consider just 
a system with one degree of freedom, so that n = 1 and the phase space is just 
two-dimensional with phase points z = (q, p). The general case goes through in almost 
exactly the same way, as you can check for yourself by doing Problem 13.36. 

Each phase point z = (q, p) moves through the two-dimensional phase space in 
accordance with Hamilton’s equations, with velocity 

$$ \mathbf{v} = \dot{\mathbf{z}} = (\dot{q}, \dot{p}) = \left(\frac{\partial\mathcal{H}}{\partial p},\, -\frac{\partial\mathcal{H}}{\partial q}\right). $$

We consider an arbitrary closed surface S moving through phase space with the phase 
points, as was illustrated in Figure 13.8. The rate at which the volume inside S changes 
is given by (13.59), where now the volume is a two-dimensional volume (really an 
area) and 

$$ \nabla\cdot\mathbf{v} = \frac{\partial\dot{q}}{\partial q} + \frac{\partial\dot{p}}{\partial p} = \frac{\partial}{\partial q}\left(\frac{\partial\mathcal{H}}{\partial p}\right) + \frac{\partial}{\partial p}\left(-\frac{\partial\mathcal{H}}{\partial q}\right), \qquad (13.61) $$

which is zero because the order of the two differentiations in the double derivatives is 
immaterial. Since ∇ · v = 0, it follows that dV/dt = 0, and we have proved that the 
volume V enclosed by any closed surface is a constant as the surface moves around 
in phase space. This is Liouville’s theorem. 
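The cancellation in (13.61) can be seen numerically for a Hamiltonian whose two terms are separately nonzero. The Hamiltonian below, ℋ = p²/(2m) − k cos q + c q²p, is a made-up example (not one from this chapter), and q̇ and ṗ are obtained from it by hand using Hamilton’s equations; the two terms of the divergence are then estimated with central differences.

```python
import math

M, K, C = 1.0, 2.0, 0.1  # illustrative constants in H = p^2/(2m) - k cos q + c q^2 p


def q_dot(q, p):
    """dH/dp for the illustrative Hamiltonian."""
    return p / M + C * q * q


def p_dot(q, p):
    """-dH/dq for the illustrative Hamiltonian."""
    return -(K * math.sin(q) + 2.0 * C * q * p)


def divergence_terms(q, p, h=1e-5):
    """Central-difference estimates of d(q_dot)/dq and d(p_dot)/dp."""
    dqdot_dq = (q_dot(q + h, p) - q_dot(q - h, p)) / (2.0 * h)
    dpdot_dp = (p_dot(q, p + h) - p_dot(q, p - h)) / (2.0 * h)
    return dqdot_dq, dpdot_dp


term_q, term_p = divergence_terms(0.7, -1.3)
print(term_q, term_p, term_q + term_p)  # the two terms cancel
```

Each term is nonzero on its own (they are ±2cq for this choice), so the vanishing sum reflects the equality of the mixed second derivatives of ℋ rather than any special simplicity of the example.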

There is another way to state Liouville’s theorem: We saw before that the number 
N of dots, representing identical systems, inside any given volume V cannot change. 
(No dot can cross the boundary S from the inside to the outside or vice versa.) We have 
now seen that the volume V cannot change. Therefore, the density of dots, ρ = N/V, 
cannot change either. This statement is sometimes paraphrased to say that the cloud of 
dots moves through phase space like an incompressible fluid. However, it is important 
to be aware what this statement means. The density ρ can, of course, be different at 
different phase points z = (q, p); all we are claiming is that as we follow a phase 
point along its orbit, the density at this point does not change. 

Unfortunately we cannot pursue here the consequences of Liouville’s theorem, but 
I can just mention one example: We saw in Chapter 12 that when motion is chaotic, two 
identical systems that start out with nearly identical initial conditions move rapidly 
apart in phase space. Thus if we consider a small initial volume in phase space, such as 
is shown in Figure 13.8, and if the motion is chaotic, then at least some pairs of points 
inside the volume must move rapidly apart. But we have now seen that the total volume 
V cannot change. Therefore, as the volume grows in one direction, it must contract 
in another direction, becoming something like a cigar. Now it frequently happens that 
the region in which the phase points can move is bounded. (For example, conservation 
of energy has this effect for the harmonic oscillator of Figure 13.5.) In this case, as 
the volume V gets longer and thinner, it has to become intricately folded in on itself, 
adding another twist to the already fascinating story of chaotic motion. 

Finally, a couple of points on the validity of Liouville’s theorem: First, in all of the 
examples of this chapter, we have assumed that the Hamiltonian is not explicitly 
dependent on time, ∂ℋ/∂t = 0, and that the forces are conservative and the coordinates 
“natural,” so that ℋ = T + U. However, none of these assumptions is necessary for 
the truth of Liouville’s theorem. The proof given in this section depends only on the 
validity of Hamilton’s equations, and any system which obeys Hamilton’s equations 
also obeys Liouville’s theorem. For example, a charged particle in an electromagnetic 
field obeys Hamilton’s equations (Problem 13.18) and hence also Liouville’s theorem, 
even though ∂ℋ/∂t may be nonzero and ℋ is certainly not equal to T + U. Second, 
Liouville’s theorem applies to the Hamiltonian phase space with coordinates (q, p), 
and there is no corresponding theorem for the Lagrangian state space with 
coordinates (q, q̇). This is one of the most important advantages of the Hamiltonian over 
the Lagrangian approach. 

Principal Definitions and Equations of Chapter 13 

The Hamiltonian 

If a system has generalized coordinates q = (q₁, ..., q_n), Lagrangian ℒ, and 
generalized momenta p_i = ∂ℒ/∂q̇_i, its Hamiltonian is defined as 

$$ \mathcal{H} = \sum_{i=1}^{n} p_i \dot{q}_i - \mathcal{L}, \qquad \text{[Eq. (13.22)]} $$

always considered as a function of the variables q and p (and possibly t). 

Hamilton’s Equations 

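The time evolution of a system is given by Hamilton’s equations 

$$ \dot{q}_i = \frac{\partial\mathcal{H}}{\partial p_i} \quad\text{and}\quad \dot{p}_i = -\frac{\partial\mathcal{H}}{\partial q_i} \qquad [i = 1, \ldots, n]. \qquad \text{[Eq. (13.25)]} $$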


Phase Space and Phase-Space Orbits 

The phase space of a system is the 2n-dimensional space with points (q, p) defined 
by the n generalized coordinates q_i and the n corresponding momenta p_i. A 
phase-space orbit is the path traced in phase space by a system as time evolves. 

[Sections 13.5 & 13.6] 


Liouville’s Theorem 

If we imagine a large number of identical systems launched at the same time with 
slightly different initial conditions, the phase-space points that represent the systems 
can be seen as forming a fluid. Liouville’s theorem states that the density of this fluid 
is constant in time (or, equivalently, that the volume occupied by any group of points 
is constant). [Section 13.7] 


Problems for Chapter 13 

Stars indicate the approximate level of difficulty, from easiest (★) to most difficult (★★★). 

SECTION 13.2 Hamilton’s Equations for One-Dimensional Systems 

13.1 ★ Find the Lagrangian, the generalized momentum, and the Hamiltonian for a free particle (no 
forces at all) confined to move along the x axis. (Use x as your generalized coordinate.) Find and solve 
Hamilton’s equations. 

13.2 ★ Consider a mass m constrained to move in a vertical line under the influence of gravity. Using 
the coordinate x measured vertically down from a convenient origin O, write down the Lagrangian ℒ 
and find the generalized momentum p = ∂ℒ/∂ẋ. Find the Hamiltonian ℋ as a function of x and p, 
and write down Hamilton’s equations of motion. (It is too much to hope with a system this simple that 
you would learn anything new by using the Hamiltonian approach, but do check that the equations of 
motion make sense.) 

13.3 ★ Consider the Atwood machine of Figure 13.2, but suppose that the pulley is a uniform disc 
of mass M and radius R. Using x as your generalized coordinate, write down the Lagrangian, the 
generalized momentum p, and the Hamiltonian ℋ = pẋ − ℒ. Find Hamilton’s equations and use them 
to find the acceleration ẍ. 

13.4 ★ The Hamiltonian ℋ is always given by ℋ = pq̇ − ℒ (in one dimension), and this is the form 
you should use if in doubt. However, if your generalized coordinate q is “natural” (relation between q 
and the underlying Cartesian coordinates is independent of time) then ℋ = T + U, and this form is 
almost always easier to write down. Therefore, in solving any problem you should quickly check to 
see if the generalized coordinate is “natural,” and if it is you can use the simpler form ℋ = T + U. For 
the Atwood machine of Example 13.2 (page 527), check that the generalized coordinate was “natural.” 
[Hint: There are one generalized coordinate x and two underlying Cartesian coordinates x and y. You 
have only to write equations for the two Cartesians in terms of the one generalized coordinate and 
check that they don’t involve the time, so it’s safe to use ℋ = T + U. This is ridiculously easy!] 

13.5 ★★ A bead of mass m is threaded on a frictionless wire that is bent into a helix with cylindrical polar 
coordinates (ρ, φ, z) satisfying z = cφ and ρ = R, with c and R constants. The z axis points vertically 
up and gravity vertically down. Using φ as your generalized coordinate, write down the kinetic and 
potential energies, and hence the Hamiltonian ℋ as a function of φ and its conjugate momentum p. 
Write down Hamilton’s equations and solve for φ and hence z. Explain your result in terms of Newtonian 
mechanics and discuss the special case that R = 0. 

13.6 ★★ In discussing the oscillation of a cart on the end of a spring, we almost always ignore the mass 
of the spring. Set up the Hamiltonian ℋ for a cart of mass m on a spring (force constant k) whose mass 
M is not negligible, using the extension x of the spring as the generalized coordinate. Solve Hamilton’s 
equations and show that the mass oscillates with angular frequency ω = √(k/(m + M/3)). That is, the 
effect of the spring’s mass is to add M/3 to m. (Assume that the spring’s mass is distributed uniformly 
and that it stretches uniformly.) 

13.7 ★★★ A roller coaster of mass m moves along a frictionless track that lies in the xy plane (x 
horizontal and y vertically up). The height of the track above the ground is given by y = h(x). (a) Using 
x as your generalized coordinate, write down the Lagrangian, the generalized momentum p, and the 
Hamiltonian ℋ = pẋ − ℒ (as a function of x and p). (b) Find Hamilton’s equations and show that 
they agree with what you would get from the Newtonian approach. [Hint: You know from Section 4.7 
that Newton’s second law takes the form F_tang = ms̈, where s is the distance measured along the track. 
Rewrite this as an equation for x and show that you get the same result from Hamilton’s equations.] 

SECTION 13.3 Hamilton’s Equations in Several Dimensions 

13.8 ★ Find the Lagrangian, the generalized momenta, and the Hamiltonian for a free particle (no 
forces at all) moving in three dimensions. (Use x, y, z as your generalized coordinates.) Find and solve 
Hamilton’s equations. 

13.9 ★ Set up the Hamiltonian and Hamilton’s equations for a projectile of mass m, moving in a vertical 
plane and subject to gravity but no air resistance. Use as your coordinates x measured horizontally and 
y measured vertically up. Comment on each of the four equations of motion. 

13.10 ★ Consider a particle of mass m moving in two dimensions, subject to a force F = −kx x̂ + K ŷ, 
where k and K are positive constants. Write down the Hamiltonian and Hamilton’s equations, using x 
and y as generalized coordinates. Solve the latter and describe the motion. 

13.11 ★ The simple form ℋ = T + U is true only if your generalized coordinates are “natural” (relation 
between generalized and underlying Cartesian coordinates is independent of time). If the generalized 
coordinates are not “natural,” you must use the definition ℋ = Σ p_i q̇_i − ℒ. To illustrate this point, 
consider the following: Two children are playing catch inside a railroad car that is moving with varying 
speed V along a straight horizontal track. For generalized coordinates you can use the position (x, y, z) 
of the ball relative to a point fixed in the car, but in setting up the Hamiltonian you must use coordinates 
in an inertial frame, that is, a frame fixed to the ground. Find the Hamiltonian for the ball and show that it 
is not equal to T + U (neither as measured in the car, nor as measured in the ground-based frame). 

13.12 ★ Same as Problem 13.11, but use the following system: A bead of mass m is threaded on a 
frictionless, straight rod, which lies in a horizontal plane and is forced to spin with constant angular 
velocity ω about a vertical axis through the midpoint of the rod. Find the Hamiltonian for the bead and 
show that it is not equal to T + U. 

13.13 ★★ Consider a particle of mass m constrained to move on a frictionless cylinder of radius R, 
given by the equation ρ = R in cylindrical polar coordinates (ρ, φ, z). The mass is subject to just one 
external force, F = −kr r̂, where k is a positive constant, r is its distance from the origin, and r̂ is the 
unit vector pointing away from the origin, as usual. Using z and φ as generalized coordinates, find the 
Hamiltonian ℋ. Write down and solve Hamilton’s equations and describe the motion. 

13.14 ★★ Consider the mass confined to the surface of a cone described in Example 13.4 (page 533). We 
saw there that there have to be maximum and minimum heights z_max and z_min, beyond which the mass 
cannot stray. When z is a maximum or minimum, it must be that ż = 0. Show that this can happen if 
and only if the conjugate momentum p_z = 0, and use the equation ℋ = E, where ℋ is the Hamiltonian 
function (13.33), to show that, for a given energy E, this occurs at exactly two values of z. [Hint: Write 
down the function ℋ for the case that p_z = 0 and sketch its behavior as a function of z for 0 < z < ∞. 
How many times can this function equal any given E?] Use your sketch to describe the motion of the 
mass. 

13.15 ★★ Fill in the details of the derivation of Hamilton’s 2n equations (13.25) for a system with n 
degrees of freedom, starting from Equation (13.24). You can parallel the argument that led from (13.14) 
to (13.15) and (13.16), but you have 2n different derivatives to consider and lots of summations from 
i = 1 to n to contend with. 

13.16 ★★ Starting from the expression (13.24) for the Hamiltonian, prove that ∂ℋ/∂t = −∂ℒ/∂t. 
[Hint: Consider first a system with one degree of freedom, for which (13.24) simplifies to ℋ(q, p, t) = 
p q̇(q, p, t) − ℒ(q, q̇(q, p, t), t).] 

13.17 ★★★ Consider the mass confined to the surface of a cone described in Example 13.4 (page 533). 
We saw that there are solutions for which the mass remains at the fixed height z = z₀, with fixed 
angular velocity φ̇₀ say. (a) For any chosen value of p_φ, use (13.34) to get an equation that gives the 
corresponding value of the height z₀. (b) Use the equations of motion to show that this motion is stable. 
That is, show that if the orbit has z = z₀ + ε with ε small, then ε will oscillate about zero. (c) Show 
that the angular frequency of these oscillations is ω = √3 φ̇₀ sin α, where α is the half angle of the 
cone (tan α = c where c is the constant in ρ = cz). (d) Find the angle α for which the frequency of 
oscillation ω is equal to the orbital angular velocity φ̇₀, and describe the motion for this case. 

13.18 ★★★ All of the examples in this chapter and all of the problems (except this one) treat forces 
that come from a potential energy U(r) [or occasionally U(r, ṙ)]. However, the proof of Hamilton’s 
equations given in Section 13.3 applies to any system for which Lagrange’s equations hold, and this 
can include forces not derivable from a potential energy. An important example of such a force is the 
magnetic force on a charged particle. (a) Use the Lagrangian (7.103) to show that the Hamiltonian for 
a charge q in an electromagnetic field is 

$$ \mathcal{H} = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + qV. $$

(This Hamiltonian plays an important role in the quantum mechanics of charged particles.) (b) Show 
that Hamilton’s equations are equivalent to the familiar Lorentz force equation m r̈ = q(E + v × B). 

SECTION 13.4 Ignorable Coordinates 

13.19* In Example 13.3 (page 531) we saw that if we write the Hamiltonian for a two-dimensional 
central force problem in terms of polar coordinates r and 0, then the coordinate 0 is ignorable. Write 
down the Hamiltonian for the same problem, but using rectangular coordinates x,y. Show that, with this 
choice, neither coordinate is ignorable. [The moral of this is that the choice of generalized coordinates 
calls for some care. In. particular, you must look for any symmetries in a system and try to choose 
generalized coordinates to take advantage of them.] 

13.20 * Consider a mass $m$ moving in two dimensions, subject to a single force $\mathbf{F}$ that is independent of
$\mathbf{r}$ and $t$. (a) Find the potential energy $U(\mathbf{r})$ and the Hamiltonian $\mathcal{H}$. (b) Show that if you use rectangular
coordinates x, y with the x axis in the direction of F, then y is ignorable. (c) Show that if you use 
rectangular coordinates x, y with neither axis in the direction of F, then neither coordinate is ignorable. 
(Moral: Choose generalized coordinates carefully!) 

13.21 ** Two masses $m_1$ and $m_2$ are joined by a massless spring (force constant $k$ and natural length $l_o$)
and are confined to move in a frictionless horizontal plane, with CM and relative positions $\mathbf{R}$ and $\mathbf{r}$ as
defined in Section 8.2. (a) Write down the Hamiltonian $\mathcal{H}$ using as generalized coordinates $X, Y, r, \phi$,
where $(X, Y)$ are the rectangular components of $\mathbf{R}$, and $(r, \phi)$ are the polar coordinates of $\mathbf{r}$. Which
coordinates are ignorable and which are not? Explain. (b) Write down the 8 Hamilton equations of
motion. (c) Solve the $r$ equations for the special case that $p_\phi = 0$ and describe the motion. (d) Describe
the motion for the case that $p_\phi \neq 0$ and explain physically why the $r$ equation is harder to solve in this
case.

13.22 ** In the Lagrangian formalism, a coordinate $q_i$ is ignorable if $\partial\mathcal{L}/\partial q_i = 0$; that is, if $\mathcal{L}$ is
independent of $q_i$. This guarantees that the momentum $p_i$ is constant. In the Hamiltonian approach,
we say that $q_i$ is ignorable if $\mathcal{H}$ is independent of $q_i$, and this too guarantees $p_i$ is constant. These
two conditions must be the same, since the result “$p_i$ = const” is the same either way. Prove directly
that this is so, as follows: (a) For a system with one degree of freedom, prove that $\partial\mathcal{H}/\partial q = -\partial\mathcal{L}/\partial q$
starting from the expression (13.14) for the Hamiltonian. This establishes that $\partial\mathcal{H}/\partial q = 0$ if and only
if $\partial\mathcal{L}/\partial q = 0$. (b) For a system with $n$ degrees of freedom, prove that $\partial\mathcal{H}/\partial q_i = -\partial\mathcal{L}/\partial q_i$ starting
from the expression (13.24).

13.23 *** Consider the modified Atwood machine shown in Figure 13.11. The two weights on the left
have equal masses $m$ and are connected by a massless spring of force constant $k$. The weight on the right
has mass $M = 2m$, and the pulley is massless and frictionless. The coordinate $x$ is the extension of the
spring from its equilibrium length; that is, the length of the spring is $l_e + x$ where $l_e$ is the equilibrium
length (with all the weights in position and $M$ held stationary). (a) Show that the total potential energy
(spring plus gravitational) is just $U = \frac{1}{2}kx^2$ (plus a constant that we can take to be zero). (b) Find the
two momenta conjugate to $x$ and $y$. Solve for $\dot x$ and $\dot y$, and write down the Hamiltonian. Show that the
coordinate $y$ is ignorable. (c) Write down the four Hamilton equations and solve them for the following
initial conditions: You hold the mass $M$ fixed with the whole system in equilibrium and $y = y_0$. Still
holding $M$ fixed, you pull the lower mass $m$ down a distance $x_0$, and at $t = 0$ you let go of both masses.
[Hint: Write down the initial values of $x$, $y$ and their momenta. You can solve the $x$ equations by
combining them into a second-order equation for $x$. Once you know $x(t)$, you can quickly write down
the other three variables.] Describe the motion. In particular, find the frequency with which $x$ oscillates.


section 13.5 Lagrange’s Equations vs. Hamilton’s Equations 

13.24 * Here is a simple example of a canonical transformation that illustrates how the Hamiltonian
formalism lets one mix up the $q$’s and the $p$’s. Consider a system with one degree of freedom and
Hamiltonian $\mathcal{H} = \mathcal{H}(q, p)$. The equations of motion are, of course, the usual Hamiltonian equations
$\dot q = \partial\mathcal{H}/\partial p$ and $\dot p = -\partial\mathcal{H}/\partial q$. Now consider new coordinates in phase space defined as $Q = p$ and
$P = -q$. Show that the equations of motion for the new coordinates $Q$ and $P$ are $\dot Q = \partial\mathcal{H}/\partial P$ and
$\dot P = -\partial\mathcal{H}/\partial Q$; that is, the Hamiltonian formalism applies equally to the new choice of coordinates
where we have exchanged the roles of position and momentum.

13.25 *** Here is another example of a canonical transformation, which is still too simple to be of
any real use, but does nevertheless illustrate the power of these changes of coordinates. (a) Consider
a system with one degree of freedom and Hamiltonian $\mathcal{H} = \mathcal{H}(q, p)$ and a new pair of coordinates $Q$
and $P$ defined so that

$$q = \sqrt{2P}\,\sin Q \quad\text{and}\quad p = \sqrt{2P}\,\cos Q. \tag{13.62}$$

Prove that if $\partial\mathcal{H}/\partial q = -\dot p$ and $\partial\mathcal{H}/\partial p = \dot q$, it automatically follows that $\partial\mathcal{H}/\partial Q = -\dot P$ and
$\partial\mathcal{H}/\partial P = \dot Q$. In other words, the Hamiltonian formalism applies just as well to the new coordinates as
to the old. (b) Show that the Hamiltonian of a one-dimensional harmonic oscillator with mass $m = 1$
and force constant $k = 1$ is $\mathcal{H} = \frac{1}{2}(q^2 + p^2)$. (c) Show that if you rewrite this Hamiltonian in terms of
the coordinates $Q$ and $P$ defined in (13.62), then $Q$ is ignorable. [The change of coordinates (13.62)
was cunningly chosen to produce this elegant result.] What is $P$? (d) Solve the Hamiltonian equation
for $Q(t)$ and verify that, when rewritten for $q$, your solution gives the expected behavior.

section 13.6 Phase-Space Orbits 

13.26 * Find the Hamiltonian $\mathcal{H}$ for a mass $m$ confined to the $x$ axis and subject to a force $F_x = -kx^3$
where $k > 0$. Sketch and describe the phase-space orbits.

13.27 ** Figure 13.6 shows some phase-space orbits for a mass in free fall. The points $A_0, B_0, C_0, D_0$
represent four different possible initial states at time 0, and $A, B, C, D$ are the corresponding states
at a later time. Write down the position $x(t)$ and momentum $p(t)$ as functions of $t$ and use these to
prove that $ABCD$ is a parallelogram with area equal to the rectangle $A_0B_0C_0D_0$. [This is an example
of Liouville’s theorem.]

13.28 ** Consider a mass $m$ confined to the $x$ axis and subject to a force $F_x = kx$ where $k > 0$. (a) Write
down and sketch the potential energy $U(x)$ and describe the possible motions of the mass. (Distinguish
between the cases that $E > 0$ and $E < 0$.) (b) Write down the Hamiltonian $\mathcal{H}(x, p)$, and describe the
possible phase-space orbits for the two cases $E > 0$ and $E < 0$. (Remember that the function $\mathcal{H}(x, p)$
must equal the constant energy $E$.) Explain your answers to part (b) in terms of those to part (a).

section 13.7 Liouville’s Theorem* 

13.29* Figure 13.10 shows an initially spherical volume getting stretched into an ellipsoid by the 
shearing flow (13.57). Make a similar sketch for a volume that is initially spherical and centered on the 
origin. 

13.30 * Figure 13.9 shows a fluid flow where the flow is everywhere outward (at least for all points on
the surface $S$ shown). This means that all contributions to the change $\delta V$ in volume are positive, and
$V$ is definitely increasing. Sketch the corresponding picture for the case that the flow is outward on the
upper part of $S$ and inward on the lower part. Explain clearly why the contributions $\mathbf{n} \cdot \mathbf{v}\,\delta t\,dA$ to the
change in $V$ from the lower part of $S$ are negative, and hence that $\delta V$ can be of either sign, depending
on whether the positive contributions outweigh the negative or vice versa.

13.31 * Evaluate the three-dimensional divergence $\nabla \cdot \mathbf{v}$ for each of the following vectors: (a) $\mathbf{v} = k\mathbf{r}$,
(b) $\mathbf{v} = k(z, x, y)$, (c) $\mathbf{v} = k(z, y, x)$, (d) $\mathbf{v} = k(x, y, -2z)$, where $\mathbf{r} = (x, y, z)$ is the usual position
vector and $k$ is a constant.

13.32 ** Evaluate the three-dimensional divergence $\nabla \cdot \mathbf{v}$ for each of the following vectors: (a) $\mathbf{v} = k\hat{\mathbf{x}}$,
(b) $\mathbf{v} = kx\hat{\mathbf{x}}$, (c) $\mathbf{v} = ky\hat{\mathbf{x}}$. We know that $\nabla \cdot \mathbf{v}$ represents the net outward flow associated with $\mathbf{v}$. In
those cases where you found $\nabla \cdot \mathbf{v} = 0$, make a simple sketch to illustrate that the outward flow is
zero; in those cases where you found $\nabla \cdot \mathbf{v} \neq 0$, make a sketch to show why and whether the outflow
is positive or negative.

13.33 ** The divergence theorem is a remarkable result, relating the surface integral that gives the flow
of $\mathbf{v}$ out of a closed surface $S$ to the volume integral of $\nabla \cdot \mathbf{v}$. Occasionally it is easy to evaluate both of
these integrals and one can check the validity of the theorem. More often, one of the integrals is much
easier to evaluate than the other, and the divergence theorem then gives one a slick way to evaluate a
hard integral. The following exercises illustrate both of these situations. (a) Let $\mathbf{v} = k\mathbf{r}$, where $k$ is a
constant and let $S$ be a sphere of radius $R$ centered on the origin. Evaluate the left side of the divergence
theorem (13.56) (the surface integral). Next calculate $\nabla \cdot \mathbf{v}$ and use this to evaluate the right side of
(13.56) (the volume integral). Show that the two agree. (b) Now use the same velocity $\mathbf{v}$, but let $S$ be
a sphere not centered on the origin. Explain why the surface integral is now hard to evaluate directly,
but don’t actually do it. Instead, find its value by doing the volume integral. (This second route should
be no harder than before.)

13.34 ** (a) Evaluate $\nabla \cdot \mathbf{v}$ for $\mathbf{v} = k\hat{\mathbf{r}}/r^2$ using rectangular coordinates. (Note that $\hat{\mathbf{r}}/r^2 = \mathbf{r}/r^3$.)
(b) Inside the back cover, you will find expressions for the various vector operators (divergence,
gradient, etc.) in polar coordinates. Use the expression for the divergence in spherical polar coordinates
to confirm your answer to part (a). (Take $r \neq 0$.)

13.35 ** A beam of particles is moving along an accelerator pipe in the $z$ direction. The particles are
uniformly distributed in a cylindrical volume of length $L_0$ (in the $z$ direction) and radius $R_0$. The
particles have momenta uniformly distributed with $p_z$ in an interval $p_0 \pm \Delta p_z$ and the transverse
momentum $p_\perp$ inside a circle of radius $\Delta p_\perp$. To increase the particles’ spatial density, the beam is
focused by electric and magnetic fields, so that the radius shrinks to a smaller value $R$. What does
Liouville’s theorem tell you about the spread in the transverse momentum $p_\perp$ and the subsequent
behavior of the radius $R$? (Assume that the focusing does not affect either $L_0$ or $\Delta p_z$.)

13.36 ** Prove Liouville’s theorem in the $2n$-dimensional phase space of a system with $n$ degrees
of freedom. You can follow closely the argument around Equations (13.60) and (13.61). The only
difference is that now the phase velocity $\mathbf{v} = \dot{\mathbf{z}}$ is a $2n$-dimensional vector and $\nabla \cdot \mathbf{v}$ is a $2n$-dimensional
divergence.

13.37 *** The general proof of the divergence theorem

$$\int_S \mathbf{n} \cdot \mathbf{v}\, dA = \int_V \nabla \cdot \mathbf{v}\, dV \tag{13.63}$$

is fairly complicated and not especially illuminating. However, there are a few special cases where it is
reasonably simple and quite instructive. Here is one: Consider a rectangular region bounded by the six
planes $x = X$ and $X + A$, $y = Y$ and $Y + B$, and $z = Z$ and $Z + C$, with total volume $V = ABC$. The
surface $S$ of this region is made up of six rectangles that we can call $S_1$ (in the plane $x = X$), $S_2$ (in the
plane $x = X + A$), and so on. The surface integral on the left of (13.63) is then the sum of six integrals,
one over each of the rectangles $S_1$, $S_2$, and so forth. (a) Consider the first two of these integrals and
show that

$$\int_{S_1} \mathbf{n} \cdot \mathbf{v}\, dA + \int_{S_2} \mathbf{n} \cdot \mathbf{v}\, dA = \int_Y^{Y+B} dy \int_Z^{Z+C} dz\, [v_x(X + A, y, z) - v_x(X, y, z)].$$

(b) Show that the integrand on the right can be rewritten as an integral of $\partial v_x/\partial x$ over $x$ running from
$x = X$ to $x = X + A$. (c) Substitute the result of part (b) into part (a), and write down the corresponding
results for the two remaining pairs of faces. Add these results to prove the divergence theorem (13.63).



CHAPTER 14


Collision Theory 


The collision experiment, or scattering experiment, is the single most powerful tool for
investigating the structure of atomic and subatomic objects. In this type of experiment
one fires a stream of projectiles, such as electrons or protons, at a target object
(an atom or atomic nucleus, for example) and, by observing the distribution of
“scattered” projectiles as they emerge from the collision, one can gain information
about the target and its interactions with the projectile. Perhaps the most famous
collision experiment was the discovery by Ernest Rutherford (1871-1937) of the
structure of the atom: Rutherford and his assistants fired streams of $\alpha$ particles (the
positively charged nuclei of helium atoms) at a thin layer of gold atoms in a sheet of
gold foil; by measuring the distribution of the scattered $\alpha$ particles, they were able to
deduce that most of the mass of an atom is concentrated in a tiny, positively charged
“nucleus” at the center of the atom. Since that time, most discoveries in atomic and
subatomic physics (the discoveries of the neutron, of nuclear fission and fusion, of
quarks, and many more) were made with the help of collision experiments, in which
a stream of projectiles was directed at a suitable target and the outgoing particles
carefully monitored.

You could imagine doing a scattering experiment with larger objects — scattering 
one billiard ball off another, or even a comet off the sun — but in these cases there are 
usually easier ways to find out about the target. Thus the main application of collision 
theory is at the atomic level and below. Since the correct mechanics for atomic and 
subatomic systems is quantum mechanics, this means that the most widely used form 
of collision theory is quantum collision theory. Nevertheless, many of the central 
ideas of quantum collision theory — total and differential scattering cross sections, 
lab and CM reference frames — already appear in the classical theory, which gives 
an excellent introduction to these ideas without the complications of quantum theory. 
This, then, is the main purpose of this chapter, to give an introduction to the main 
ideas of collision theory in the context of classical mechanics. 

The main reason why collision theory is a rather complicated structure is that 
on the atomic and subatomic scale one cannot possibly follow the detailed orbit of 
the projectile as it interacts with the target. As we shall see, this means that we can
learn very little by observing a single projectile. On the other hand, if we send in
many projectiles, we can observe the number of them that get scattered in different
directions, and we can learn a great deal from this. It is to handle the statistical distribution
of the many scattered projectiles, in the many possible directions, that we have to
introduce the idea of the collision cross section, which is the central concept of
collision theory, both classical and quantum,¹ and is the main subject of this chapter.


14.1 The Scattering Angle and Impact Parameter 


Before we introduce the central concept of the scattering cross section, it is helpful 
to introduce two other important parameters, the scattering angle and the impact 
parameter; and before we do this, it is good to have a couple of simple collision
experiments in mind. All collision experiments start with a projectile approaching a 
target from so far away that it moves essentially freely and its energy is purely kinetic. 
In Figure 14.1, a fixed target exerts a force on the projectile, and, as the latter gets close 
to the target, the orbit curves, so that the projectile is “scattered” and moves away in 
a different direction. An example of this sort of collision is the famous Rutherford 
experiment, in which the projectile and target were both positively charged particles 
and the force that caused the scattering was the Coulomb repulsion between them. In 
Figure 14.2, the target is a hard sphere, which exerts a force on the projectile only 
when they come into contact (a contact force); thus, the projectile travels in a straight
line until it hits the target (if it does) and then bounces and moves off in a different 
direction. A familiar example of this kind of event is the collision of two billiard balls 
(though in this case the projectile and target would be the same size). 

With these two examples in mind, we can now define the first two key parameters of 
collision theory. The scattering angle $\theta$ is defined as the angle between the incoming
and outgoing velocities of the projectile, as indicated in both Figures 14.1 and 14.2. In
the absence of any target, the scattering angle would, of course, be zero. Thus $\theta = 0$
corresponds to no scattering; for example, in Figure 14.2 the projectile could miss the
target entirely and then $\theta$ would be zero. The maximum possible value is $\theta = \pi$; in
Figure 14.2, a head-on collision, in which the projectile comes in along the target’s
axis and bounces straight back, would give $\theta = \pi$.

The impact parameter $b$ is defined as the perpendicular distance from the projectile’s
incoming straight-line path to a parallel axis through the target’s center, as
shown on the left in both figures. A second way to think about the impact parameter 
is illustrated toward the right in Figure 14.1: You can think of b as the distance that 
would be the distance of closest approach if there were no forces on the projectile, 
so that the orbit was just a straight line. In other words, the impact parameter tells 


1 Quantum mechanics is an intrinsically statistical theory, dealing with probabilities rather than 
definite predictable outcomes. Thus in quantum mechanics the need to discuss the distribution of 
different scattered directions is there from the outset — in contrast with the situation in classical 
mechanics, where the same need arises from the practical impossibility of observing the detailed 
orbit of a projectile. Nevertheless, the machinery for handling the problem, most notably the idea
of the collision cross section, is very similar in the two theories.

Figure 14.1 In this collision experiment the projectile approaches from
the left moving like a free particle. When it begins to feel the force field
of the target, its orbit curves and it moves away in a different direction.
The scattering angle $\theta$ is the angle between the initial and final velocities.
The impact parameter b is the perpendicular distance from the incoming 
straight-line orbit to a parallel axis through the center of the target. 



Figure 14.2 A scattering experiment in which the projectile and target 
interact only through a contact force, so that the projectile is deflected 
only when it actually hits the target and bounces off it. A collision occurs
only if the impact parameter b is less than the radius of the target. 


how closely the projectile was aimed at the target. If the impact parameter is very
large, then the projectile will hardly feel the target, and $\theta$ will be small. In fact, in
Figure 14.2, $\theta$ would be precisely zero for any value of $b$ greater than the radius of the
target. At the other extreme, the value $b = 0$ implies a head-on collision and often²
corresponds to $\theta = \pi$. It is reasonably clear from Figures 14.1 and 14.2 that for a
given value of $b$ there will be a unique corresponding value of $\theta$, and we shall find, in
fact, that the main theoretical task in classical collision theory is to find the functional
relation $\theta = \theta(b)$ between these two variables.
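
For the hard-sphere target of Figure 14.2 the relation between impact parameter and scattering angle can be written in closed form, which makes a convenient first illustration. The Python sketch below assumes a point projectile that bounces elastically with equal angles of incidence and reflection, which gives $b = R\cos(\theta/2)$ for $b \le R$. That formula is an assumption here (it is not derived in the text above), and the function name is purely illustrative.

```python
import math

def hard_sphere_scattering_angle(b, R):
    """Scattering angle theta (radians) for a point projectile with impact
    parameter b bouncing elastically off a fixed hard sphere of radius R.

    Assumes the law of reflection at the surface, which gives
    b = R*cos(theta/2) for b <= R. A projectile with b > R misses.
    """
    if b < 0 or R <= 0:
        raise ValueError("need b >= 0 and R > 0")
    if b > R:
        return 0.0                      # miss: no deflection
    return 2.0 * math.acos(b / R)       # head-on (b = 0) gives theta = pi

# Tabulate theta for a few impact parameters, in units of R.
for b in (0.0, 0.5, 1.0, 1.5):
    theta = hard_sphere_scattering_angle(b, 1.0)
    print(f"b = {b:.1f} R  ->  theta = {theta:.3f} rad")
```

The two limits discussed above appear directly: a head-on collision returns the maximum angle, and any impact parameter beyond the target radius returns zero.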

In atomic and subatomic physics (where collision theory has its greatest application),
the experimental status of the scattering angle is totally different from that of


2 In both our examples (the Rutherford experiment and hard-sphere scattering) $b = 0$ certainly
implies $\theta = \pi$. On the other hand, if the force between the projectile and target is attractive, then a
projectile with $b = 0$ will crash into the target and, depending on the nature of the target, may
never re-emerge, or may plow straight through and emerge with $\theta = 0$.

Figure 14.3 (a) In a cloud chamber, any moving charged particle leaves a visible
track which can be photographed for later examination. Here four protons
have entered the chamber from the left, three of them passing straight through
undeflected. The fourth was deflected when it hit a nitrogen nucleus and left the
chamber near the top right. The recoiling nucleus can be seen moving down and
to the right. (b) Tracing of the tracks involved in the collision.


the impact parameter: The scattering angle $\theta$ is rather easily measured, whereas the
impact parameter can never be measured directly. Let us address these two points in 
turn. There are many ways to measure the scattering angle, one of the most transparent 
of which is illustrated in Figure 14.3. This could represent a photograph made in a 
cloud chamber of a collision of a proton with a nitrogen nucleus. These particles are 
of course far too small to be photographed directly, but the cloud chamber is one of 
several devices that let one record the track of a moving charged particle — even when 
the particle is far too small to be seen itself. In the cloud chamber, a charged particle 
moving through a cloud of supersaturated water vapor ionizes some of the atoms that 
it passes, and these ionized atoms cause some of the water vapor to condense, creating 
a visible trail, somewhat like the vapor trail left in the sky by some aircraft. In Figure 14.3
you can clearly see the trail of the incoming proton. The nucleus which the
proton eventually strikes is initially invisible, since it is not moving. However, when 
the proton strikes the nucleus, the proton’s track makes an abrupt change of direction, 
and the nucleus recoils leaving its own track. From pictures like this, the scattering 
angle is easily measured. 

On the other hand, the impact parameter can never be directly measured. The 
problem is that impact parameters of interest are of atomic or subatomic size — around 
0.1 nanometers or less. A cloud-chamber track like that of Figure 14.3 has a width of 
order 1 millimeter, some 10 million times greater than the largest impact parameters 
of interest. Obviously direct and meaningful measurements of the impact parameter 
are out of the question. As we shall see in the next section, it is the impossibility 
of measuring the impact parameter that leads us to the notion of the collision cross 
section. 


14.2 The Collision Cross Section 


Let us imagine first that we were to observe a single collision like that shown in 
Figure 14.3. If we knew the impact parameter, then we could obviously deduce
something about the size of the target or the strength and range of the force that it
exerts on the projectile. But given that we do not know the impact parameter, there 
is remarkably little we can learn from a single event like this; in fact, about the only 
thing we can conclude from this one event is that there is some kind of obstacle in 
the projectile’s path. On the other hand, if we can observe many different collisions 
of similar projectiles and targets, then we can begin to investigate the nature of the 
projectile and target and their interactions. To explore this important idea, I shall 
consider several simple examples. 

Consider first an experiment like that shown in Figure 14.2, where a projectile 
of negligible size impinges on a hard sphere of radius R, with which it interacts 
only on contact. If the projectile hits the target, it will bounce off it and emerge in 
a different direction; if the projectile misses the target, it will pass straight through 
undeflected. When observed experimentally, the collision illustrated in Figure 14.2 
will look something like Figure 14.3, and the observation of one such event tells us 
only that the target is there. 

Suppose, however, that we could repeat the same experiment many times. In practice this
is accomplished in two ways: Instead of having just one target, we have many targets
in a single target assembly (for example, many gold nuclei in a gold foil or many
helium atoms in a tank of helium gas). And, instead of firing in a single projectile, we
fire in a whole beam of projectiles. To begin, let us consider a single projectile passing
through an assembly of hard-sphere targets. As “seen” by the incoming projectile, the
target assembly looks something like Figure 14.4. Since we don’t know the projectile’s
precise line of approach, we can’t say whether it will hit one of the targets or not.
However, we can calculate the probability that it will make a hit, as follows: If the targets
are randomly positioned and sufficiently numerous,³ we can speak of the target
density $n_{\text{tar}}$ as the number of targets per area, as viewed from the incident direction.
If $A$ is the total area of the target assembly, then the total number of targets is $n_{\text{tar}} A$.
Next, we denote by $\sigma = \pi R^2$ the cross-sectional area, or just cross section, of each target
(as seen from the front), so that the total area of all the targets is $n_{\text{tar}} A \sigma$. Therefore,



Figure 14.4 A target assembly with several hard-sphere targets, as
seen head-on by the incoming projectile. The cross section $\sigma$ is the
area of any one target, perpendicular to the incident direction. The
target density $n_{\text{tar}}$ is the density (number/area) of targets as seen from
the incident direction, as here.


3 It is important to have lots of targets for statistical considerations to apply. On the other hand, 
the targets must not be too numerous, or some targets may get hidden in the “shadow” of others, 
and the projectile may make several collisions. It was to avoid multiple collisions that Rutherford 
used a thin foil in his famous experiment.

the probability that any one projectile makes a hit as it passes through the assembly
on a random path is the ratio

$$(\text{probability of a hit}) = \frac{\text{area occupied by targets}}{\text{total area}} = \frac{n_{\text{tar}} A \sigma}{A} = n_{\text{tar}}\sigma. \tag{14.1}$$

If we now send in a beam containing a large number (call it $N_{\text{inc}}$) of incident projectiles,
then the actual number of projectiles that get scattered ($N_{\text{sc}}$) should be the product of
the probability (14.1) and $N_{\text{inc}}$:

$$N_{\text{sc}} = N_{\text{inc}}\, n_{\text{tar}}\, \sigma. \tag{14.2}$$


This is the basic relation of collision theory. Since we can measure the numbers $N_{\text{sc}}$ and
$N_{\text{inc}}$ and the target density $n_{\text{tar}}$, Equation (14.2) lets us find the size (or cross section)
$\sigma$ of the target. In what follows, we shall see that the notion of the cross section gets
generalized considerably and can become quite complicated, but the essential idea is
always the same: By counting the number of scatterings (or reactions, or absorptions,
or other processes) that result from a large number of similar collisions, one can use
the analog of (14.2) to find the parameter $\sigma$, which is always the effective area of the
target for interacting with the projectile.
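
The counting argument behind (14.1) and (14.2) can be checked numerically. The following Python sketch is an illustration only: it scatters hard-disc targets at random over a square target assembly, fires projectiles along random lines of approach, and compares the number of hits with the product of incident number, target density, and cross section. The function name, the square geometry, the random seed, and the particular numbers are all assumptions, chosen so that the targets are sparse and rarely overlap.

```python
import math
import random

def count_hits(n_inc, n_targets, side, radius, seed=1):
    """Fire n_inc projectiles at random points of a square target assembly
    (side x side) holding n_targets discs of the given radius; return the
    number of projectiles that land on at least one disc."""
    rng = random.Random(seed)
    # Keep every disc wholly inside the square so no target area is lost.
    centers = [(rng.uniform(radius, side - radius),
                rng.uniform(radius, side - radius))
               for _ in range(n_targets)]
    r2 = radius * radius
    hits = 0
    for _ in range(n_inc):
        x, y = rng.uniform(0.0, side), rng.uniform(0.0, side)
        if any((x - cx) ** 2 + (y - cy) ** 2 <= r2 for cx, cy in centers):
            hits += 1
    return hits

n_inc, n_targets, side, radius = 20000, 50, 1.0, 0.01
n_tar = n_targets / side ** 2          # targets per unit area
sigma = math.pi * radius ** 2          # cross section of one disc
predicted = n_inc * n_tar * sigma      # Equation (14.2)
observed = count_hits(n_inc, n_targets, side, radius)
print(f"predicted {predicted:.0f}, observed {observed}")
```

The observed count fluctuates statistically around the prediction, and it falls slightly below it whenever two discs happen to overlap, which is the "shadowing" effect that a thin target assembly is meant to avoid.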


example 14.1 Shooting Crows in an Oak Tree

A hunter observes 50 crows settling randomly in an oak tree, where he can no
longer see them. Each crow has a cross-sectional area $\sigma \approx \frac{1}{2}$ ft$^2$, and the oak has
a total area (as seen from the hunter’s position) of 150 square feet. If the hunter
fires 60 bullets at random into the tree, about how many crows would he expect
to hit?

This situation closely parallels our simple scattering experiment. The target
density is $n_{\text{tar}}$ = (number of crows)/(area of tree) = 50/150 = 1/3 ft$^{-2}$. The number
of incident projectiles is $N_{\text{inc}} = 60$, so, by the analog of (14.2), the expected
number of hits is

$$N_{\text{hit}} = N_{\text{inc}}\, n_{\text{tar}}\, \sigma = 60 \times (\tfrac{1}{3}\ \text{ft}^{-2}) \times (\tfrac{1}{2}\ \text{ft}^2) = 10.$$
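
The arithmetic of this example is easy to reproduce. The short Python sketch below evaluates the analog of (14.2), taking the cross section of a crow as 1/2 ft$^2$ (the value implied by the final answer of 10). The function name `expected_hits` is just an illustrative choice.

```python
def expected_hits(n_inc, n_targets, total_area, sigma):
    """Expected number of hits from N_hit = N_inc * n_tar * sigma,
    where n_tar = n_targets / total_area (targets per unit area)."""
    n_tar = n_targets / total_area
    return n_inc * n_tar * sigma

# 60 bullets, 50 crows, a tree of 150 ft^2, each crow about 1/2 ft^2
print(expected_hits(n_inc=60, n_targets=50, total_area=150.0, sigma=0.5))
```

Because the relation is linear in each factor, doubling the number of bullets or the number of crows doubles the expected number of hits, provided the crows stay sparse enough not to hide behind one another.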


In practice, one often uses a steady stream of projectiles, and it may be more
convenient to divide the incident number $N_{\text{inc}}$ by the time $\Delta t$, to give the incident rate
$R_{\text{inc}} = N_{\text{inc}}/\Delta t$. Similarly, the scattered rate is $R_{\text{sc}} = N_{\text{sc}}/\Delta t$. Dividing both sides of
(14.2) by $\Delta t$, we get the completely equivalent relation

$$R_{\text{sc}} = R_{\text{inc}}\, n_{\text{tar}}\, \sigma$$

for these rates.

Since the cross section $\sigma$ is an area, the SI unit of cross section is, of course,
the square meter. Atomic and nuclear cross sections are inconveniently small when
measured in square meters. In particular, typical nuclear dimensions are around
$10^{-14}$ m, so nuclear cross sections are conveniently measured in units of $10^{-28}$ m$^2$.
This area has come to be called a barn (as in “it’s as big as a barn”),

$$1\ \text{barn} = 10^{-28}\ \text{m}^2.$$


example 14.2 Scattering of Neutrons in an Aluminum Foil

If 10,000 neutrons are fired through an aluminum foil 0.1 mm thick and the cross
section of the aluminum nucleus is about 1.5 barns,⁴ how many neutrons will
be scattered? (Specific gravity of aluminum = 2.7.)

The number of scatterings is given by (14.2), and we already know that
$N_{\text{inc}} = 10^4$ and $\sigma = 1.5 \times 10^{-28}$ m$^2$. Thus all we need to find is the target density
$n_{\text{tar}}$, the number of aluminum nuclei per area of the foil. (Of course, the foil
contains lots of atomic electrons as well, but these do not contribute appreciably
to the scattering of neutrons.) The density of aluminum (mass/volume) is $\varrho$ =
$2.7 \times 10^3$ kg/m$^3$. If we multiply this by the thickness of the foil ($t = 10^{-4}$ m),
this will give the mass per area of the foil, and dividing this by the mass of an
aluminum nucleus ($m$ = 27 atomic mass units), we will have $n_{\text{tar}}$:

$$n_{\text{tar}} = \frac{\varrho t}{m} = \frac{(2.7 \times 10^3\ \text{kg/m}^3) \times (10^{-4}\ \text{m})}{27 \times 1.66 \times 10^{-27}\ \text{kg}} = 6.0 \times 10^{24}\ \text{m}^{-2}. \tag{14.3}$$

Substituting into (14.2) we find for the number of scatterings

$$N_{\text{sc}} = N_{\text{inc}}\, n_{\text{tar}}\, \sigma = (10^4) \times (6.0 \times 10^{24}\ \text{m}^{-2}) \times (1.5 \times 10^{-28}\ \text{m}^2) = 9.$$

Here, we used the given cross section $\sigma$ to predict the number $N_{\text{sc}}$ of scatterings
we should observe. Alternatively, we could have used the observed value of $N_{\text{sc}}$
to find the cross section $\sigma$.
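
As a check on the numbers in this example, here is a minimal Python sketch that computes the target density of the foil and then the number of scatterings from (14.2). The constant and function names are illustrative assumptions; the numerical inputs are those quoted in the example.

```python
AMU_KG = 1.66e-27      # atomic mass unit in kg (value used in the text)
BARN_M2 = 1e-28        # 1 barn in m^2

def areal_density(rho, thickness, nucleus_mass):
    """Targets per unit area of a foil: n_tar = rho * t / m."""
    return rho * thickness / nucleus_mass

# Aluminum foil: density 2.7e3 kg/m^3, thickness 0.1 mm, 27 u per nucleus
n_tar = areal_density(rho=2.7e3, thickness=1e-4, nucleus_mass=27 * AMU_KG)
n_sc = 1e4 * n_tar * (1.5 * BARN_M2)   # Equation (14.2) with N_inc = 10^4
print(f"n_tar = {n_tar:.2e} per m^2, N_sc = {n_sc:.1f}")
```

Run in the other direction, the same relation gives the cross section from a measured count: dividing an observed number of scatterings by the product of the incident number and `n_tar` returns the cross section in square meters.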


14.3 Generalizations of the Cross Section 


The relation (14.2), with its many generalizations, is the fundamental relation of 
collision theory. Theorists calculate the cross section $\sigma$ using assumed models of the
target, and experimenters then use (14.2) to measure $\sigma$ and compare with the predicted
value. However, the projectile and target, and their interactions, are generally much 
more complicated than the simple point projectile and hard-sphere target that we used 
in the last section. In this section, we’ll look at a few slightly more interesting cases. 


4 As we shall see shortly, the cross section of a target can be different for different projectiles. 
Thus, strictly speaking, I should say that the cross section of aluminum for scattering neutrons is 
about 1.5 barns. Moreover, the cross section can depend on the energy of the projectiles; the number
given here is valid for energies from about 0.1 eV to about 1000 eV.

Scattering of Two Hard Spheres

Let us imagine a target that is a hard sphere of radius $R_2$ and a projectile that is
another hard sphere of radius $R_1$, as shown in Figure 14.5. The two spheres make
contact if and only if the impact parameter $b$ is less than or equal to the sum of the
two radii, $b \le R_1 + R_2$; that is, the center of the projectile must lie inside a circle
centered on the target with radius $R_1 + R_2$ and area $\sigma = \pi(R_1 + R_2)^2$. We can now go
through exactly the same argument as in the last section, finding the probability of any
one projectile getting scattered, and thence the total number of projectiles scattered.
The only difference is that the area of the target $\sigma = \pi R^2$ must now be replaced by
$\sigma = \pi(R_1 + R_2)^2$. Thus we arrive at the same conclusion

$$N_{\text{sc}} = N_{\text{inc}}\, n_{\text{tar}}\, \sigma, \tag{14.4}$$

except that now

$$\sigma = \pi(R_1 + R_2)^2. \tag{14.5}$$

The main moral of this example is that we can continue to use the usual relation
(14.4), but you should no longer see $\sigma$ as the cross section of the target. Rather, $\sigma$ is
a property of the target and projectile and should be thought of as the effective area
of the former for scattering the latter. In particular, the cross section of a particular
target for scattering one kind of projectile may be very different from that for the same
target with a different projectile.



Figure 14.5 A hard-sphere projectile of radius $R_1$ approaches a hard-sphere
target of radius $R_2$, with impact parameter $b$. A collision occurs
only if $b \le R_1 + R_2$.


Example 14.3 Mean Free Path of an Air Molecule

The N$_2$ and O$_2$ molecules in the air around us behave very much like hard spheres
of radius $R \approx 0.15$ nm. Use this to estimate the mean free path of an air molecule
at STP.

The mean free path $\lambda$ of a molecule in a gas is an important parameter
that determines several properties of the gas (the conductivity, viscosity, and
diffusion rates, for example). It is defined as the average distance that a molecule
travels between collisions with other molecules. To estimate this, let us follow
one chosen molecule as it moves through the gas, starting immediately after a
collision. To simplify our discussion, I shall assume that all the other molecules
are stationary. (This approximation changes the answer a little, but we shall still
get a reasonable estimate.) We can think of our chosen molecule as a projectile
moving through a target assembly of all the other stationary molecules, and our
problem is to find the average distance $x$ it will travel before making a collision.
The cross section for collisions is given by (14.5) as $\sigma = \pi(2R)^2 = 4\pi R^2$. If
we imagine first a thin slice of the gas with thickness $dx$ (in the direction of our
projectile’s velocity), then the “target” density of this slice is

$$n_{\rm tar} = \frac{N}{V}\,dx$$

where $N$ is the total number of molecules and $V$ the volume they occupy, so that $N/V$
is the number density (number/volume). Thus the probability that our molecule
will make a collision in any thin slice of thickness $dx$ is given by (14.1) as

$$\text{prob(collision in } dx) = \frac{N\sigma}{V}\,dx. \tag{14.6}$$


Let us denote by prob(x) the probability that the projectile travels a distance 
x without making any collisions. The probability that it travels a distance x 
without colliding and then does collide in the next dx is the product of prob(x) 
and (14.6): 


$$\text{prob(first collision between } x \text{ and } x + dx) = \text{prob}(x)\cdot\frac{N\sigma}{V}\,dx. \tag{14.7}$$

On the other hand, this same probability is just

$$\text{prob(first collision between } x \text{ and } x + dx) = \text{prob}(x) - \text{prob}(x + dx) = -\frac{d}{dx}\text{prob}(x)\,dx. \tag{14.8}$$

Comparing (14.7) with (14.8), we get a differential equation for prob($x$):

$$\frac{d}{dx}\text{prob}(x) = -\frac{N\sigma}{V}\,\text{prob}(x),$$

from which we see that prob($x$) decreases exponentially with $x$,

$$\text{prob}(x) = e^{-(N\sigma/V)x}. \tag{14.9}$$

[Here I have used the initial condition that prob(0) is obviously 1.] The mean
free path is the average value of $x$ (that is, $\lambda = \langle x \rangle$), and, to find this average, we
must multiply $x$ by the probability (14.8) and integrate over all possible values
of $x$:

$$\lambda = \langle x \rangle = \int_0^\infty x \left[\frac{N\sigma}{V}\, e^{-(N\sigma/V)x}\right] dx = \frac{V}{N\sigma}. \tag{14.10}$$

At STP we know that 22.4 liters of air contain Avogadro’s number of molecules
(one mole). Therefore,

$$\lambda = \frac{V}{N_A (4\pi R^2)} = \frac{22.4 \times 10^{-3}\ \text{m}^3}{(6.02 \times 10^{23}) \times 4\pi (0.15 \times 10^{-9}\ \text{m})^2} = 1.3 \times 10^{-7}\ \text{m} = 130\ \text{nm}.$$

We see that the mean free path of an air molecule is considerably more than the
inter-molecular spacing (around 3 nm) and vastly bigger than the molecular size
(around 0.3 nm).
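The arithmetic of this example can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as the text (stationary hard-sphere targets, one mole in 22.4 liters at STP); the function name `mean_free_path` and the constant names are illustrative choices, not part of the text.

```python
import math

AVOGADRO = 6.02e23           # molecules per mole (value used in the example)
MOLAR_VOLUME_STP = 22.4e-3   # m^3 occupied by one mole at STP (value used in the example)


def mean_free_path(radius_m, number_density):
    """Return lambda = 1 / (n * sigma) for hard spheres, with sigma = pi * (2R)^2 from (14.5)."""
    sigma = math.pi * (2.0 * radius_m) ** 2   # cross section for two equal spheres of radius R
    return 1.0 / (number_density * sigma)     # same as V / (N * sigma) in (14.10)


n = AVOGADRO / MOLAR_VOLUME_STP     # number density N/V in m^-3
lam = mean_free_path(0.15e-9, n)    # R = 0.15 nm
print(f"mean free path = {lam:.2e} m")   # roughly 1.3e-7 m, i.e. about 130 nm
```

Changing `radius_m` or the number density shows how quickly the mean free path responds: it falls as $1/R^2$ and as $1/n$.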


Different Processes and Targets 

So far we have considered collisions in which the most that can happen is that 
the projectile is deflected by the target and emerges in a different direction. There 
are several other possibilities: Consider a collision between a point projectile and 
a target consisting of a ball of putty. If the putty is sufficiently absorbent, then any 
projectile that hits it will burrow in and never re-emerge. That is, the target will 
capture, or absorb, the projectile. Exactly our previous argument would then give
the number of projectiles captured as $N_{\rm cap} = N_{\rm inc}\, n_{\rm tar}\, \sigma$. We can easily make matters
more complicated. For example, part of the target’s surface could be absorbent and 
part hard. Projectiles that hit the absorbent surface would be captured and those that hit 
the hard part would be scattered (that is, re-emerge traveling in a different direction). 
In this case, we would have two separate relations analogous to (14.4), one to give the 
number of captures and the second for the number of scatterings: 

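$$N_{\rm cap} = N_{\rm inc}\, n_{\rm tar}\, \sigma_{\rm cap} \quad \text{[capture]} \tag{14.11}$$

and

$$N_{\rm sc} = N_{\rm inc}\, n_{\rm tar}\, \sigma_{\rm sc} \quad \text{[scattering]}. \tag{14.12}$$

Here $\sigma_{\rm cap}$ is the area of that part of the target that absorbs the projectile and $\sigma_{\rm sc}$ the
area of the part that scatters it. The total cross section of the target is, of course,

$$\sigma_{\rm tot} = \sigma_{\rm cap} + \sigma_{\rm sc}.$$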

An example of a real collision in which both capture and scattering are possible 
is the collision of an electron and an atom, such as chlorine, that can capture an extra 
electron. In this case, we can no longer identify a particular area of the target that will 
capture the projectile, but it will still be true that the number of projectiles captured is 
proportional to $N_{\rm inc}$, the number of incident projectiles, and to $n_{\rm tar}$, the target density.
Thus we can use the exact same equation (14.11) to define the capture cross section
$\sigma_{\rm cap}$ as the relevant constant of proportionality. With this definition you can (and
should) view $\sigma_{\rm cap}$ as the effective area of the target for capturing the projectile. In
the same way, we can use (14.12) to define the scattering cross section $\sigma_{\rm sc}$ and then
view $\sigma_{\rm sc}$ as the effective area of the target for scattering the projectile.

In collisions of electrons and atoms, there are other possibilities besides scattering
and capture. For example, if the incident electron has enough energy, it may be able
to ionize the atom, knocking one or more of the atomic electrons free. The number of
ionizations can be written as

$$N_{\rm ion} = N_{\rm inc}\, n_{\rm tar}\, \sigma_{\rm ion} \quad \text{[ionization]} \tag{14.13}$$

and this defines the ionization cross section $\sigma_{\rm ion}$ as the effective area of the target atom
for ionization by the incoming electron. Again, when a neutron collides with a nucleus
of $^{235}$U, it can cause the nucleus to fission, splitting into two much smaller nuclei
and releasing some 200 MeV of kinetic energy, and we can define the fission cross
section $\sigma_{\rm fis}$ as the effective area of a $^{235}$U nucleus for fission by neutron bombardment,
satisfying the equation $N_{\rm fis} = N_{\rm inc}\, n_{\rm tar}\, \sigma_{\rm fis}$.

There is one other classification that deserves mention. The word “scattering” is 
generally reserved for a process in which the projectile is deflected and moves off, 
leaving behind the same target — the same atom, the same nucleus, or whatever. This 
usage excludes processes like capture, in which the projectile does not emerge from 
the collision at all. It also excludes processes like ionization, in which an electron is 
knocked off the target atom, or like fission, in which the target nucleus is broken into 
pieces. If the internal motions of the target are left unchanged, the scattering is said to 
be elastic; if the internal motions of the target are changed by the collision, then the 
scattering is called inelastic. Consider, for example, the scattering of an electron by a 
stationary atom. To simplify our discussion, let us suppose that the atom is fixed (an 
excellent approximation, since an atom is so much heavier than an electron) and that 
the atomic electrons are initially in their ground state, the lowest possible energy
level⁵ and the atom’s most stable state. When the incoming electron scatters off the
atom, there are two possibilities: The electron may scatter elastically, emerging with
its kinetic energy unchanged and leaving the target in its original state of internal
motion. Or it can scatter inelastically, giving some of its kinetic energy to the atom
and raising the target’s internal motion to a higher energy level.⁶ This latter process of
atomic excitation was first observed by the German physicists James Franck (1882–1964)
and Gustav Hertz (1887–1975) and gave compelling evidence for the existence
of atomic energy levels (and won them the 1925 Nobel Prize).

When a projectile scatters off a target, we can, if we wish, distinguish between the 
two types of process, elastic and inelastic. The total number of scatterings $N_{\rm sc}$ in a
given experiment is the sum of the elastic and inelastic scatterings, $N_{\rm sc} = N_{\rm el} + N_{\rm inel}$,
and we can define corresponding cross sections satisfying $\sigma_{\rm sc} = \sigma_{\rm el} + \sigma_{\rm inel}$.

For a given target and given projectile, we can enumerate all the possible outcomes 
of a collision — scattering, capture, ionization, fission, and so on — and for each 


5 Recall that atoms can exist only in certain discrete “energy levels.” Since an isolated atom 
eventually finds its way to the lowest energy level (ground state), the target atom in a collision is 
most often in its ground state. 

6 In general, both projectile and target can move, and they may both have internal structure and 
hence energy of internal motion — as in the collision of two molecules in a gas. In this case, an elastic 
scattering is defined as one in which the internal motions of both projectile and target are unchanged. 
This means the total kinetic energy $T_{\rm proj} + T_{\rm tar}$ is the same before and after the encounter.

outcome, we can define a corresponding cross section. The sum of all these partial
cross sections is called the total cross section $\sigma_{\rm tot}$. For example, for electrons colliding
with an atom, it may be that there are just three possible outcomes, scattering, capture,
or ionization; in this case, the total cross section would be

$$\sigma_{\rm tot} = \sigma_{\rm sc} + \sigma_{\rm cap} + \sigma_{\rm ion}.$$

By adding up the three equations (14.12), (14.11), and (14.13), we can see that the 
total cross section gives the total number of projectiles removed from the incident 
beam by all three possible processes: 

$$N_{\rm tot} = N_{\rm inc}\, n_{\rm tar}\, \sigma_{\rm tot} \quad \text{[total]}.$$

That is, $\sigma_{\rm tot}$ is the effective area of the target for interacting with the projectile in any
of the possible ways.

In most cases, the various cross sections we have defined are found to vary with the 
energy of the incident projectile. For an example that is easily understood, consider 
the ionization cross section for an electron colliding with an atom. To ionize the atom, 
the incoming electron must have the minimum energy needed to knock one of the 
atomic electrons free. If the incident electron’s energy is less than this ionization 
energy, then ionization is impossible; therefore, $N_{\rm ion} = 0$, and the ionization cross
section $\sigma_{\rm ion}$ defined by (14.13) is exactly zero. Above the ionization energy, ionization
is possible and both $N_{\rm ion}$ and $\sigma_{\rm ion}$ are normally nonzero. Obviously $\sigma_{\rm ion}$ varies with
energy. Although it is not always as obvious, we find in practice that almost all cross 
sections are likewise energy dependent. 


14.4 The Differential Scattering Cross Section 


In defining the cross section $\sigma_{\rm sc}$ in (14.12) we counted the total number of projectiles
that were scattered, regardless of the particular direction in which they emerged. 
Obviously we could obtain more information if we chose to monitor these directions 
as well, and this leads us to the notion of the differential cross section, as we now 
discuss. 

To simplify matters, let us consider a collision where the only possible interaction 
is elastic scattering. For example, you could consider the scattering of a point projectile 
off a hard sphere (Figure 14.2) or the Rutherford scattering of a positively charged 
alpha particle off a heavy positive nucleus (Figure 14.1). In either case, there are 
just two possibilities: Either the projectile misses the target entirely and emerges 
unscattered, or it scatters elastically. 7 


7 If the alpha particle of the Rutherford experiment had enough energy, it could also raise 
the nucleus to a higher energy level or knock it apart. However, at the low energies available to 
Rutherford, this was not a possibility, and we can confine our attention to these low energies for now. 
There is another subtlety in Rutherford scattering that is possibly worth mentioning: The Coulomb 
force $F = kqQ/r^2$ of a truly isolated nucleus would extend all the way to infinity, and all alpha

If we are to monitor how many particles emerge in a given direction, we must
agree on how to measure directions. It is standard to take the direction of the incident
beam as our $z$ axis, and then to specify the direction of any one scattered projectile
by giving its polar angles $\theta$ and $\phi$. Since these angles form an infinite continuum, we
cannot speak of the number of particles scattered in the exact direction $(\theta, \phi)$. Rather
we must count the number of particles emerging in some narrow cone around $(\theta, \phi)$.
To characterize the size of this narrow cone, we use the notion of solid angle, which 
is defined as follows. 


Solid Angle 

To understand the definition of the solid angle of a cone it helps to recall the definition
of the ordinary angle between two lines in a plane. This is illustrated in Figure 14.6(a):
If the two lines meet at $O$, we draw a circle of any convenient radius $r$ centered at $O$.
The two lines define an arc of length $s$ on the circle, and we define the angle $\Delta\theta$ (in
radians) as $\Delta\theta = s/r$. (Since $s$ is proportional to $r$, this definition is independent of
our choice of $r$.) In a similar way, if a three-dimensional cone has its apex at $O$, we
draw a sphere of radius $r$ centered at $O$ as in Figure 14.6(b). The cone intersects the
sphere in a spherical surface of area $A$ (proportional to $r^2$), and we define the solid
angle of the cone as

$$\Delta\Omega = A/r^2. \tag{14.14}$$



Figure 14.6 (a) The ordinary two-dimensional angle $\Delta\theta$ subtended by an arc length
$s$ of a circle is defined as $\Delta\theta = s/r$, where $r$ is the radius of the circle and $\Delta\theta$ is in
radians. (b) The solid angle $\Delta\Omega$ of a cone subtended by an area $A$ on a sphere is
defined as $\Delta\Omega = A/r^2$. Here $r$ is the radius of the sphere and $\Delta\Omega$ is in steradians.


particles, however large their impact parameter, would be deflected a tiny bit; in other words, all of 
the incident alphas would be scattered. In practice, however, the Coulomb force of the nucleus is 
always screened at large distances by the atomic electrons, and those alphas that pass outside the 
whole atom are not scattered.

The unit of solid angle defined in this way is called the steradian, abbreviated sr. If the
cone includes all possible directions, then since the area of the whole sphere is $4\pi r^2$,
the solid angle is $4\pi$. That is, the solid angle corresponding to all possible directions
in three dimensions is $4\pi$ steradians, just as the ordinary angle corresponding to all
directions in two dimensions is $2\pi$ radians. The cone shown in Figure 14.6(b) was a
circular cone, but the definition $\Delta\Omega = A/r^2$ works equally for any shape of cone. For
instance, we shall need to consider the narrow cone with polar angles in the ranges $\theta$
to $\theta + d\theta$ and $\phi$ to $\phi + d\phi$; this cone intersects the sphere in a “rectangular” surface
of area $r^2 \sin\theta\, d\theta\, d\phi$, and so has solid angle

$$d\Omega = \sin\theta\, d\theta\, d\phi. \tag{14.15}$$
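As a quick consistency check on (14.15), one can integrate it over a circular cone whose axis is the $z$ axis. The half-angle $\theta_{\max}$ is a symbol introduced here purely for illustration; it does not appear in the text. The result reduces to the full-sphere value $4\pi$ when the cone opens up to include every direction.

```latex
\Delta\Omega_{\text{cone}}
  = \int_0^{2\pi} d\phi \int_0^{\theta_{\max}} \sin\theta \, d\theta
  = 2\pi \left( 1 - \cos\theta_{\max} \right),
\qquad
\Delta\Omega_{\text{cone}} \Big|_{\theta_{\max} = \pi} = 4\pi .
```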


The Differential Cross Section 

Armed with the notion of solid angle, we are ready to define the differential scattering 
cross section. We imagine our usual experiment, in which a large number $N_{\rm inc}$ of
projectiles are directed at a target assembly with density $n_{\rm tar}$. For any chosen cone
of solid angle $d\Omega$ in any chosen direction $(\theta, \phi)$, we now monitor the number of
projectiles scattered into this $d\Omega$, as sketched in Figure 14.7. We denote this number
by $N_{\rm sc}(\text{into } d\Omega)$, and, by the familiar argument, it must be proportional to $N_{\rm inc}$ and to
$n_{\rm tar}$, so can be written as

$$N_{\rm sc}(\text{into } d\Omega) = N_{\rm inc}\, n_{\rm tar}\, d\sigma(\text{into } d\Omega) \tag{14.16}$$

where $d\sigma(\text{into } d\Omega)$ is the effective cross-sectional area of the target for scattering into
the solid angle $d\Omega$. Since this is proportional to $d\Omega$, it is traditional to rewrite it as

$$d\sigma(\text{into } d\Omega) = \frac{d\sigma}{d\Omega}\, d\Omega$$

where the factor $d\sigma/d\Omega$ is called the differential scattering cross section. In terms
of it, we can rewrite (14.16) as


Figure 14.7 Projectiles are incident from the left on a rectangular target
assembly, and we monitor the number $N_{\rm sc}(\text{into } d\Omega)$ emerging into a cone
of solid angle $d\Omega$ in the direction $(\theta, \phi)$.

$$N_{\rm sc}(\text{into } d\Omega) = N_{\rm inc}\, n_{\rm tar}\, \frac{d\sigma}{d\Omega}(\theta, \phi)\, d\Omega \tag{14.17}$$


where I have added the argument $(\theta, \phi)$ to emphasize that the differential cross section
will (in general) depend on the direction of observation. Equation (14.17) can be taken
as the definition of the differential cross section $d\sigma/d\Omega$. It is the experimentalist’s job
to measure $d\sigma/d\Omega$ using (14.17), and it is the theorist’s job to predict $d\sigma/d\Omega$ based
on some assumed model of the interactions between the projectile and the target.

If we add up the numbers $N_{\rm sc}(\text{into } d\Omega)$ for all possible solid angles $d\Omega$ we will
recover the total number of scatterings, $N_{\rm sc}$. That is, integrating (14.17) over all solid
angles will give $N_{\rm sc}$. Since $N_{\rm sc} = N_{\rm inc}\, n_{\rm tar}\, \sigma$, where $\sigma$ is the total scattering cross
section, we conclude that

$$\sigma = \int \frac{d\sigma}{d\Omega}(\theta, \phi)\, d\Omega = \int_0^{\pi} \sin\theta\, d\theta \int_0^{2\pi} d\phi\, \frac{d\sigma}{d\Omega}(\theta, \phi) \tag{14.18}$$

where the second form follows from (14.15). That is, the total cross section⁸ is the
integral over all solid angles of the differential cross section.


Example 14.4 Angular Distribution of Scattered Neutrons

At an incident energy of several MeV (million electron volts), the differential
cross section for scattering of neutrons off a heavy nucleus might have the form

$$\frac{d\sigma}{d\Omega}(\theta, \phi) = \sigma_0 (1 + 3\cos\theta + 3\cos^2\theta) \tag{14.19}$$

where $\sigma_0$ is a constant that could be about 30 millibarns per steradian (mb/sr).
Describe the angular distribution of the scattered neutrons and find the total
scattering cross section.

The most prominent feature of (14.19) is that it is independent of $\phi$. (As we
shall see in the next section, this is a fairly common occurrence.) This means
that the distribution of scattered neutrons is axially symmetric, which makes
visualization of the angular distribution much simpler, since we have only to
worry about its dependence on $\theta$. Figure 14.8 shows $d\sigma/d\Omega$ as a function of $\theta$
from $\theta = 0$ to $\pi$. We see that $d\sigma/d\Omega$ is largest at $\theta = 0$. That is, in this example
at least, a particle that is scattered is most likely to be scattered near to the
“forward direction” $\theta = 0$. (In quantum scattering, especially at high energies,
this is often the case.) In this example, there is also a much smaller maximum
in the backward direction at $\theta = \pi$, and the probability for scattering directly
back at $\theta = \pi$ is greater than for any other direction in the backward hemisphere
$\pi/2 < \theta < \pi$.


8 In general, one should say the total scattering cross section, $\sigma_{\rm sc}$; here we are assuming that
scattering is the only possible interaction, so they are the same.

Figure 14.8 The differential cross section (14.19) for scattering of
neutrons off a nucleus, plotted as a function of the scattering angle
$\theta$. The vertical axis shows $d\sigma/d\Omega$ in units of millibarns per steradian.


The total scattering cross section is found by integrating the differential cross
section, as in (14.18). Since the integrand is independent of $\phi$, the integration
over $\phi$ is trivial and gives a factor of $2\pi$, so

$$\sigma = \int \frac{d\sigma}{d\Omega}(\theta, \phi)\, d\Omega = 2\pi\sigma_0 \int_0^{\pi} \sin\theta\, d\theta\, (1 + 3\cos\theta + 3\cos^2\theta) = 8\pi\sigma_0 = 754\ \text{mb}. \tag{14.20}$$
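The integral in (14.20) is easy to confirm numerically. The sketch below applies a simple midpoint rule to (14.18) for an axially symmetric cross section; the function names and the step count are illustrative choices, and $\sigma_0 = 30$ mb/sr is the value suggested in the example.

```python
import math

SIGMA0_MB_PER_SR = 30.0   # sigma_0 in mb/sr, the illustrative value from Example 14.4


def dsigma_domega(theta):
    """Differential cross section (14.19) in mb/sr; it does not depend on phi."""
    c = math.cos(theta)
    return SIGMA0_MB_PER_SR * (1.0 + 3.0 * c + 3.0 * c * c)


def total_cross_section(f, steps=20000):
    """Midpoint-rule estimate of 2*pi * integral_0^pi f(theta) sin(theta) dtheta, as in (14.18)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h              # midpoint of the i-th theta interval
        total += f(theta) * math.sin(theta)
    return 2.0 * math.pi * total * h       # factor 2*pi comes from the trivial phi integral


sigma_total = total_cross_section(dsigma_domega)
exact = 8.0 * math.pi * SIGMA0_MB_PER_SR
print(f"numerical: {sigma_total:.1f} mb, 8*pi*sigma_0: {exact:.1f} mb")
```

Any other axially symmetric $d\sigma/d\Omega$ can be passed to `total_cross_section` in the same way.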


14.5 Calculating the Differential Cross Section 


To simplify the calculation of the differential cross section, I shall assume that the 
scattering is axially symmetric. This is certainly the case if the target is spherically 
symmetric (like the hard sphere of Figure 14.2 or any target which exerts a spherically 
symmetric force field), since spherical symmetry implies axial symmetry. It means that 
the differential cross section is independent of $\phi$ and allows us to include all different
values of $\phi$ in our discussion at the same time. We imagine a projectile incident on the
target with impact parameter $b$. By calculating the projectile’s trajectory, we can, in
principle at least, find the corresponding scattering angle $\theta = \theta(b)$ as a function of $b$.
Alternatively, by solving for $b$, we can express $b$ as a function of $\theta$, that is, $b = b(\theta)$.

Let us next consider all those projectiles that approach the target with impact 
parameters between b and b + db. These are incident on the annulus (the shaded 
ring shape) shown on the left of Figure 14.9. This annulus has cross sectional area 

$$d\sigma = 2\pi b\, db. \tag{14.21}$$

These same particles emerge between angles $\theta$ and $\theta + d\theta$ in a solid angle

$$d\Omega = 2\pi \sin\theta\, d\theta, \tag{14.22}$$

Figure 14.9 All projectiles incident between $b$ and $b + db$ are scattered between
angles $\theta$ and $\theta + d\theta$. The area on which these particles impinge is
$d\sigma = 2\pi b\, db$, and the solid angle into which they scatter is $d\Omega = 2\pi \sin\theta\, d\theta$.


as indicated in Figure 14.9. The differential cross section $d\sigma/d\Omega$ is now found by
simply dividing (14.21) by (14.22) to give

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta} \left| \frac{db}{d\theta} \right| \tag{14.23}$$


where I have inserted the absolute value signs to ensure that $d\sigma/d\Omega$ is positive.
(Because $\theta$ often decreases as $b$ increases, $db/d\theta$ may be negative.)

In summary: To calculate the differential cross section for a projectile scattering off
a given target, we must first calculate the projectile’s trajectory, to find the scattering
angle $\theta$ as a function of the impact parameter $b$ (or vice versa). Then $d\sigma/d\Omega$ is found
by differentiating $b$ with respect to $\theta$ and substituting into (14.23).


Example 14.5 Hard Sphere Scattering

As a first example of the use of (14.23), find the differential cross section for 
scattering of a point projectile off a fixed rigid sphere of radius R. Integrate your 
result over all solid angles to find the total cross section. 

Our first task is to find the trajectory of a scattered projectile, as shown in 
Figure 14.10. The crucial observation is that when the projectile bounces off 
the hard sphere, its angles of incidence and reflection (both shown as $\alpha$ in the
picture) are equal. (This “law of reflection” follows from conservation of energy 
and angular momentum — see Problem 14.13.) Inspection of the picture shows 
that the impact parameter is $b = R\sin\alpha$, and the scattering angle is $\theta = \pi - 2\alpha$.
Combining these two equations we find that

$$b = R \sin\frac{\pi - \theta}{2} = R\cos(\theta/2), \tag{14.24}$$

Figure 14.10 A point projectile bouncing off a fixed rigid sphere
obeys the law of reflection, that the two adjacent angles labelled $\alpha$ are
equal. The impact parameter is $b = R\sin\alpha$, and the scattering angle is
$\theta = \pi - 2\alpha$.

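and from (14.23), we find the differential cross section

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta} \left| \frac{db}{d\theta} \right| = \frac{R\cos(\theta/2)}{\sin\theta}\, \frac{R\sin(\theta/2)}{2} = \frac{R^2}{4}. \tag{14.25}$$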


The most striking thing about this result is that the differential cross section is 
isotropic; that is, the number of particles scattered into a solid angle $d\Omega$ is the
same in all directions. To find the total cross section, we have only to integrate
this result over all solid angles:

$$\sigma = \int \frac{d\sigma}{d\Omega}\, d\Omega = \int \frac{R^2}{4}\, d\Omega = \pi R^2,$$

which is, of course, the cross-sectional area of the target sphere.
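The recipe (14.23) lends itself to a numerical check. The following sketch evaluates $(b/\sin\theta)\,|db/d\theta|$ with a central-difference derivative for any user-supplied $b(\theta)$ and applies it to the hard-sphere relation (14.24). The helper names, the step size `h`, and the radius value are illustrative assumptions, not part of the text.

```python
import math


def dsigma_domega(b_of_theta, theta, h=1e-6):
    """Evaluate (14.23), (b / sin(theta)) * |db/dtheta|, with a central difference for db/dtheta."""
    db_dtheta = (b_of_theta(theta + h) - b_of_theta(theta - h)) / (2.0 * h)
    return b_of_theta(theta) / math.sin(theta) * abs(db_dtheta)


R = 2.0   # sphere radius in arbitrary length units (illustrative value)


def hard_sphere_b(theta):
    """Impact parameter for a point projectile on a hard sphere, b = R cos(theta/2), from (14.24)."""
    return R * math.cos(theta / 2.0)


for deg in (30, 90, 150):
    value = dsigma_domega(hard_sphere_b, math.radians(deg))
    print(f"theta = {deg:3d} deg: dsigma/dOmega = {value:.6f}, R^2/4 = {R * R / 4:.6f}")
```

Every angle should return the same value, $R^2/4$, which is the isotropy found in (14.25). The same function accepts any other $b(\theta)$, provided $\theta$ stays away from 0 and $\pi$ where $\sin\theta$ vanishes.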


14.6 Rutherford Scattering 


Perhaps the most famous collision experiment of all time was Rutherford’s experiment,
in which he and his assistants observed the scattering of alpha particles off the
gold nuclei in a thin gold foil and used the observed distribution to argue for the nuclear
model of the atom. According to this model, the force of a nucleus (charge $Q$)
on an alpha (charge $q$) is

$$F = \frac{kqQ}{r^2} = \frac{\gamma}{r^2}. \tag{14.26}$$


The alphas are scattered appreciably only if they approach close to the nucleus, well 
inside the orbiting atomic electrons. Therefore we can ignore the force of the latter, 
and (14.26) is the only force on the alphas. Therefore, as we saw in Chapter 8, the 
orbit of an alpha is a hyperbola, with the nucleus (which we’ll treat as fixed for the 
moment) at its focus, as shown in Figure 14.11. If u denotes the unit vector pointing 
from the target to the alpha’s point of closest approach, the orbit is symmetric about

Figure 14.11 The Rutherford scattering of an alpha particle off a fixed
atomic nucleus. The orbit is a hyperbola, which is symmetric about the
line labelled by the fixed unit vector $\mathbf{u}$. The position of the particle can
be labelled by its angle $\psi$ measured from $\mathbf{u}$. As the particle moves away
($t \to \infty$), $\psi \to \psi_0$, and as $t \to -\infty$, $\psi \to -\psi_0$. Therefore the scattering
angle is $\theta = \pi - 2\psi_0$.


the direction of $\mathbf{u}$, and it is convenient to label the alpha’s position by the polar angle
$\psi$, measured from $\mathbf{u}$ (see Figure 14.11). Let us denote by $\psi_0$ the limit of $\psi$ as the
scattered alpha moves far away, so that the total angle subtended by the alpha’s orbit
is $2\psi_0$ and the scattering angle is

$$\theta = \pi - 2\psi_0. \tag{14.27}$$

Our job now is to relate the scattering angle $\theta$ to the impact parameter $b$. We can
do this by evaluating in two ways the change in the momentum of the projectile,

$$\Delta\mathbf{p} = \mathbf{p}' - \mathbf{p}, \tag{14.28}$$

where $\mathbf{p}$ and $\mathbf{p}'$ are the momenta long before and long after the encounter. First, by
conservation of energy, $\mathbf{p}$ and $\mathbf{p}'$ have equal magnitudes, so that the triangle shown in
Figure 14.12 is isosceles, and

$$|\Delta\mathbf{p}| = 2p \sin(\theta/2). \tag{14.29}$$

On the other hand, from Newton’s second law, $\Delta\mathbf{p} = \int \mathbf{F}\, dt$. Comparing Figures
14.12 and 14.11, you can see that $\Delta\mathbf{p}$ is in the same direction as the unit vector $\mathbf{u}$. Thus

Figure 14.12 The change in momentum of the projectile is $\Delta\mathbf{p} = \mathbf{p}' - \mathbf{p}$.
Since $|\mathbf{p}| = |\mathbf{p}'|$, it is easily seen that $|\Delta\mathbf{p}| = 2p \sin(\theta/2)$.

the magnitude of $\Delta\mathbf{p}$ is given by the same integral, with $\mathbf{F}$ replaced by its component
$F_u$ in the direction of $\mathbf{u}$,

$$|\Delta\mathbf{p}| = \int_{-\infty}^{\infty} F_u\, dt.$$


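From Figure 14.11 you can see that $F_u = (\gamma/r^2) \cos\psi$. Using the now-familiar trick,
we can write $dt = d\psi/\dot\psi$, where, since $mr^2\dot\psi = \ell = bp$ (see Figure 14.11 again), we
can replace $\dot\psi$ by $bp/mr^2$. Putting all of this together, we find

$$|\Delta\mathbf{p}| = \int_{-\psi_0}^{\psi_0} \frac{\gamma \cos\psi}{r^2}\, \frac{d\psi}{bp/mr^2} = \frac{\gamma m}{bp} \int_{-\psi_0}^{\psi_0} \cos\psi\, d\psi = \frac{2\gamma m}{bp} \sin\psi_0 = \frac{2\gamma m}{bp} \cos(\theta/2). \tag{14.30}$$

[To understand the limits in the integral, recall that as $t \to \pm\infty$, $\psi \to \pm\psi_0$. In the
last step I used (14.27) to replace $\psi_0$ by $(\pi - \theta)/2$ and hence $\sin\psi_0$ by $\cos(\theta/2)$.]
Equating the two expressions (14.29) and (14.30) for $|\Delta\mathbf{p}|$, we can solve for $b$ to give

$$b = \frac{\gamma m}{p^2}\, \frac{\cos(\theta/2)}{\sin(\theta/2)} = \frac{\gamma}{mv^2} \cot(\theta/2) \tag{14.31}$$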


where in the last equality I replaced p by mv, and v is the projectile’s incident speed. 

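Having found the impact parameter $b$ as a function of the scattering angle $\theta$, we
can now use the result (14.23) to give the differential cross section

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta} \left| \frac{db}{d\theta} \right| = \frac{1}{2\sin(\theta/2)\cos(\theta/2)} \cdot \frac{\gamma}{mv^2} \cot(\theta/2) \cdot \frac{\gamma}{mv^2}\, \frac{1}{2\sin^2(\theta/2)}$$

or, replacing $\gamma$ by $kqQ$,

$$\frac{d\sigma}{d\Omega} = \left( \frac{kqQ}{4E \sin^2(\theta/2)} \right)^2 \tag{14.32}$$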


where $E$ is the energy of the incident projectiles, $E = \frac{1}{2}mv^2$. This is the celebrated
Rutherford scattering formula. It gives the differential cross section for scattering of 
a charge q, with energy E , off a fixed target of charge Q. While it is still today, nearly 
a century after its derivation by Rutherford, a much-used result, its great historical 
importance is that it was used to prove the existence of the atomic nucleus, as we now 
discuss briefly. 9 


9 Since the atom is a microscopic system, for which quantum, not classical, mechanics should 
be used, you may be surprised that the classical Rutherford formula worked so well for Rutherford 
and his assistants. It is one of the most amazing accidents in the history of physics that the quantum 
formula for scattering of two charged particles agrees exactly with Rutherford’s classical formula. 
(This is certainly not true for other force laws.)

The Experiment of Geiger and Marsden

The best known and most important Rutherford-scattering experiment was performed 
by Rutherford’s assistants Hans Geiger (inventor of the Geiger counter, 1882-1945) 
and Ernest Marsden (1889-1970) and published in 1913. Their goal was to test 
Rutherford’s “planetary” model of the atom, according to which most of the atomic 
mass is concentrated in a tiny, positively charged nucleus.¹⁰ As we have seen, this
model leads to the cross section (14.32) for scattering of alpha particles, with several
very specific predictions: The scattering probability should be inversely proportional
to $\sin^4(\theta/2)$, inversely proportional to the energy squared ($E^2$), and proportional to the
nuclear charge squared ($Q^2$). Geiger and Marsden were able to verify all of these
predictions with amazing precision, and hence to contribute to the rapid acceptance of
Rutherford’s nuclear atom. They used alpha particles coming from radon gas (“radium
emanation” as it was called then), with energy around 6.5 MeV. (1 MeV = $10^6$ electron
volts, and 1 eV = $1.6 \times 10^{-19}$ joules.) They directed a narrow “pencil” of these at a thin
metal foil and counted the scattered particles using a small zinc sulphide screen. Any
alpha particle striking this screen caused a tiny flash of light or “scintillation,” which
could be observed through a microscope. In this way, it was possible to count up to
about 90 alpha particles per minute (a job needing great patience and concentration!).
To observe the angular dependence of the scattering, they could swing the screen
and microscope around to angles in the range $5^\circ \le \theta \le 150^\circ$. To test the dependence
on incident energy, they passed the incident particles through thin sheets of mica, to 
slow them down and hence vary their energy. And to test the dependence on nuclear 
charge, they used various different target foils (gold, platinum, tin, silver, copper, and 
aluminum). 


Example 14.6 Angular Dependence

To isolate its angular dependence, write the Rutherford cross section (14.32) as

$$\frac{d\sigma}{d\Omega}(\theta) = \frac{\sigma_0(E)}{\sin^4(\theta/2)} \tag{14.33}$$

and find $\sigma_0(E)$ for scattering of 6.5 MeV alphas off gold. Find the differential
cross section at 150° and 5° (Geiger and Marsden’s largest and smallest angles).
Find the number of alphas they would have had to count in a minute assuming the
following values: The number of incident alphas in one minute, $N_{\rm inc} = 6 \times 10^8$;
the thickness of the gold foil, $t = 1\ \mu$m; area of zinc sulphide screen = 1 mm$^2$;
and distance of screen from target = 1 cm. Make a usable plot of the differential
cross section as a function of scattering angle $\theta$.


10 Initially, the sign of the nuclear charge (positive or negative) was not clear, but it was soon 
found to be positive, with an equal negative charge carried by the orbiting electrons.

The charge of the alpha particle is $q = 2e$ and that of the gold nucleus is
$Q = 79e$, so

$$\sigma_0(E) = \left( \frac{2 \times 79 \times ke^2}{4E} \right)^2.$$

This is easily evaluated in SI units, though a slicker way is to use the useful
combination $ke^2 = 1.44$ MeV·fm (where fm stands for femtometer or $10^{-15}$ m).
Either way, we find that

$$\sigma_0 = 76.6 \times 10^{-30}\ \text{m}^2/\text{sr} = 0.766\ \text{barns/sr}.$$


Substituting into (14.33) we get

$$\frac{d\sigma}{d\Omega}(150^\circ) = 0.88\ \text{barns/sr} \quad \text{and} \quad \frac{d\sigma}{d\Omega}(5^\circ) = 2.1 \times 10^5\ \text{barns/sr}. \tag{14.34}$$

The huge difference between these (more than 5 orders of magnitude)
presents considerable practical difficulties, as we shall see. Before we can substitute
into (14.17) to give the actual numbers counted we need to calculate
$n_{\rm tar}$ and $d\Omega$. As usual, we can find $n_{\rm tar}$ in terms of the density of gold (specific
gravity 19.3) and its atomic mass (197):

$$n_{\rm tar} = \frac{(19.3 \times 10^3\ \text{kg/m}^3) \times (10^{-6}\ \text{m})}{197 \times 1.66 \times 10^{-27}\ \text{kg}} = 5.90 \times 10^{22}\ \text{m}^{-2}.$$

Geiger and Marsden’s screen had area $A = 1$ mm$^2$ and was at a distance $r = 10$
mm from the target. Therefore, it subtended a solid angle

$$d\Omega = \frac{A}{r^2} = 0.01\ \text{sr}.$$

Putting all of this together, we find for the number of alphas hitting their screen 
at 150° in a minute 

/V sc (at 150°) = /Vi nc n tar —(150 °)dQ 
dQ. 

= (6 x 10 8 ) x (5.90 x 10 22 m~ 2 ) x (0.88 x 10~ 28 m 2 /sr) x (0.01 sr) 
= 31, 

a number that they could count easily and accurately. On the other hand, the
same calculation gives

$$N_{\text{sc}}(\text{at } 5^\circ) = 7.5 \times 10^6,$$

a number that they could not possibly count or even estimate. Obviously, measuring
the cross section at small angles required them to use a much, much weaker
source than at large angles.
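
The arithmetic of this example can be retraced in a few lines of Python. What follows is only a sketch: the names `sigma0_barns`, `rutherford_barns`, `target_density`, and `counts` are illustrative choices, not anything defined in the text, and the constants are the rounded values quoted above ($ke^2 = 1.44$ MeV·fm and an atomic mass unit of $1.66 \times 10^{-27}$ kg), together with the conversion 1 barn = 100 fm$^2$.

```python
import math

KE2_MEV_FM = 1.44        # k e^2 in MeV*fm (rounded value quoted in the example)
FM2_PER_BARN = 100.0     # 1 barn = 1e-28 m^2 = 100 fm^2
U_KG = 1.66e-27          # atomic mass unit in kg (rounded)


def sigma0_barns(z_projectile, z_target, energy_mev):
    """sigma_0(E) = (k q Q / 4E)^2, returned in barns/sr."""
    length_fm = z_projectile * z_target * KE2_MEV_FM / (4.0 * energy_mev)
    return length_fm ** 2 / FM2_PER_BARN


def rutherford_barns(sigma0, theta_deg):
    """Equation (14.33): sigma_0 / sin^4(theta/2), in barns/sr."""
    return sigma0 / math.sin(math.radians(theta_deg) / 2.0) ** 4


def target_density(rho_kg_m3, thickness_m, atomic_mass):
    """Targets per unit area, n_tar = rho * t / m, in m^-2."""
    return rho_kg_m3 * thickness_m / (atomic_mass * U_KG)


sigma0 = sigma0_barns(2, 79, 6.5)            # alphas (Z = 2) on gold (Z = 79)
n_tar = target_density(19.3e3, 1e-6, 197)    # 1 micron of gold
d_omega = 1e-6 / (1e-2) ** 2                 # 1 mm^2 screen at 1 cm, in sr
n_inc = 6e8                                  # incident alphas per minute


def counts(theta_deg):
    """Expected counts per minute on the screen at the given angle."""
    dsigma_m2 = rutherford_barns(sigma0, theta_deg) * 1e-28  # barns -> m^2
    return n_inc * n_tar * dsigma_m2 * d_omega


for angle in (150, 5):
    print(f"{angle:3d} deg: {rutherford_barns(sigma0, angle):.3g} barns/sr, "
          f"{counts(angle):.3g} counts per minute")
```

Run as written, this should reproduce, to within rounding, the values of $\sigma_0$, $n_{\text{tar}}$, and the two count rates found above.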

Because of the huge variation of the cross section as the scattering angle
varies, a straightforward linear plot of $d\sigma/d\Omega$ is not especially useful. If we
choose a scale to show the small angles, the cross section for large angles will
appear to be zero; if we arrange to show the large angles, the cross section for
small angles will disappear off scale. The solution to this is to make a semilog
plot, that is, to graph the log of the cross section against $\theta$. This gives the curve
shown in Figure 14.13, where you can see clearly the variation by more than
5 orders of magnitude between 5° and 180°. The dots in this figure are the
original data of Geiger and Marsden and show clearly why Rutherford’s model
of the atom gained such quick acceptance.

Figure 14.13 Semilog plot of the Rutherford differential cross section as a
function of angle $\theta$. The dots are the measurements of Geiger and Marsden.
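
To see why a logarithmic scale is the natural choice, one can tabulate $\log_{10}(d\sigma/d\Omega)$ at a handful of angles. The sketch below does this without any plotting library; the helper name `log10_rutherford` and the choice of sample angles are assumptions made for illustration, and $\sigma_0 = 0.766$ barns/sr is the value found in this example.

```python
import math


def log10_rutherford(theta_deg, sigma0=0.766):
    """log10 of the Rutherford cross section (14.33), with sigma0 in barns/sr."""
    return math.log10(sigma0 / math.sin(math.radians(theta_deg) / 2.0) ** 4)


angles = [5, 15, 30, 60, 90, 120, 150, 180]
rows = [(a, log10_rutherford(a)) for a in angles]

for a, lg in rows:
    print(f"{a:4d} deg   log10[dsigma/dOmega / (barn/sr)] = {lg:6.2f}")

# Number of decades spanned between the smallest and largest angle listed
decades = rows[0][1] - rows[-1][1]
print(f"span between {angles[0]} and {angles[-1]} deg: {decades:.2f} decades")
```

On a linear axis these values would differ by a factor of $10^{\text{decades}}$; on a semilog plot they are simply evenly readable heights.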


14.7 Cross Sections in Various Frames* 


* As usual, sections marked with an asterisk can be omitted on a first reading. 

For the most part, we have so far discussed collisions in which the target particle is 
fixed. While this is an excellent approximation if the target is very heavy compared 
to the projectile (as in scattering of electrons off an atom, for example), we must 
nevertheless recognize that there is no such thing as a truly fixed particle, and we 
must learn to treat collisions of two particles both of which can move. Fortunately, we 
already know how to do this: If we observe the motion in the CM frame (the reference 
frame where the center of mass is at rest), then the motion of the relative coordinate 
$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ is precisely the same as that of a single particle with mass equal to the
reduced mass $\mu = m_1 m_2/M$. Thus, if we view the collision in the CM frame, then
our problem is reduced back to the motion of a single “equivalent particle” in a fixed 
force field. The only remaining difficulty is this: The CM frame is usually not the 
frame in which we do experiments. Thus we must learn how to relate cross sections 
calculated in the CM frame to their corresponding values in the lab frame, the frame 
of the laboratory in which the experiment is to be performed. In particular, we are 
going to want to find the relation between the differential cross section $(d\sigma/d\Omega)_{\text{cm}}$
of the CM frame and the corresponding $(d\sigma/d\Omega)_{\text{lab}}$ measured in the lab.

The CM Variables

Before we take up the problem of translating between frames, I need to mention 
two more elegant features of the CM frame. First, recall that the Lagrangian for two 
particles, when written in terms of the CM and relative coordinates, has the form 
(8.13) 

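$$\mathcal{L} = \tfrac{1}{2} M \dot{\mathbf{R}}^2 + \tfrac{1}{2} \mu \dot{\mathbf{r}}^2 - U(r). \tag{14.35}$$

Differentiating $\mathcal{L}$ with respect to the three components of $\dot{\mathbf{r}}$, we find that the generalized
momentum corresponding to $\mathbf{r}$ is

$$\mathbf{p} = \mu \dot{\mathbf{r}}. \tag{14.36}$$

That is (as you would probably have guessed) the momentum for the relative motion
is just that of a single particle of mass $\mu$ and velocity $\dot{\mathbf{r}}$.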

The second property concerns the momenta of the two particles, as measured in 
the CM frame. Recall from (8.9) that 

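$$\mathbf{r}_1 = \mathbf{R} + \frac{m_2}{M}\mathbf{r} \quad\text{and}\quad \mathbf{r}_2 = \mathbf{R} - \frac{m_1}{M}\mathbf{r}. \tag{14.37}$$

We can differentiate these to get the two velocities. In particular, in the CM frame,
$\dot{\mathbf{R}} = 0$, so we find that $\dot{\mathbf{r}}_1 = (m_2/M)\dot{\mathbf{r}}$. Multiplying by $m_1$, we find for the momentum
of the projectile $\mathbf{p}_1 = \mu\dot{\mathbf{r}} = \mathbf{p}$. That is, in the CM frame, the projectile’s momentum
$\mathbf{p}_1$ is the same as the momentum $\mathbf{p}$ of the relative motion. In the same way we can
prove that $\mathbf{p}_2 = -\mathbf{p}$, so that

$$\mathbf{p}_1 = -\mathbf{p}_2 = \mathbf{p} = \mu\dot{\mathbf{r}} \quad \text{(in the CM frame)}. \tag{14.38}$$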


The first equality here confirms that the total momentum in the CM frame is zero. The
second is useful in the evaluation of cross sections: In measuring the differential cross
section, we count the number of times the projectile emerges with its momentum $\mathbf{p}_1$
inside some solid angle $d\Omega$. Since $\mathbf{p}_1 = \mathbf{p}$ (in the CM frame), this is the same thing as
the number of times the relative momentum $\mathbf{p}$ emerges in $d\Omega$. Therefore we can find
the differential cross section in the CM frame just as if a single particle of mass $\mu$ were
scattering off a fixed target. Thus, for example, the Rutherford formula (14.32), for
scattering of one particle off a fixed target, also gives the differential cross section for
scattering of two charged particles in their CM frame, provided we replace $m$ by $\mu$.
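
The identity (14.38) is easy to confirm numerically. The following sketch picks two arbitrary masses and lab velocities (the specific numbers are assumptions chosen only for illustration), transforms to the CM frame, and compares each particle's CM momentum with the relative momentum $\mu\dot{\mathbf{r}}$.

```python
# Arbitrary illustrative masses and lab-frame velocities (2D, arbitrary units)
m1, m2 = 1.0, 2.0
M = m1 + m2
mu = m1 * m2 / M                      # reduced mass

v1 = (3.0, 1.0)
v2 = (-0.5, 2.0)

# Velocity of the center of mass and relative velocity r_dot = v1 - v2
v_cm = tuple((m1 * a + m2 * b) / M for a, b in zip(v1, v2))
r_dot = tuple(a - b for a, b in zip(v1, v2))

# Momenta of the two particles as measured in the CM frame
p1_cm = tuple(m1 * (a - c) for a, c in zip(v1, v_cm))
p2_cm = tuple(m2 * (b - c) for b, c in zip(v2, v_cm))

# Relative momentum p = mu * r_dot, Equation (14.36)
p_rel = tuple(mu * x for x in r_dot)

print("p1 (CM):", p1_cm)
print("p2 (CM):", p2_cm)
print("mu*rdot:", p_rel)
```

The three printed vectors should satisfy $\mathbf{p}_1 = -\mathbf{p}_2 = \mu\dot{\mathbf{r}}$ up to floating-point rounding, whatever masses and velocities are chosen.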


General Relation between Cross Sections in Different Frames 

In the CM frame the projectile and target approach one another with equal and 
opposite momenta. In the lab frame of the traditional collision experiment (such as 
the Rutherford experiment) the target is initially at rest. In many modern colliding-beam
experiments the projectile and target are both moving, in opposite directions.
In all of these cases, the initial momenta are collinear, and to simplify matters, I shall
restrict our discussion to this case. 11

To see how the cross section transforms between two different frames, we have 
only to look at its definition. Let us start with the total cross section, which was defined 
in (14.2) so that 


$$N_{\text{sc}} = N_{\text{inc}}\, n_{\text{tar}}\, \sigma.$$

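(We’ll continue to assume that the only possible outcome of a collision is elastic
scattering, so we don’t have to worry about other processes like absorption or ionization.)
We can use this same definition in either frame. Thus we define the CM total cross
section $\sigma_{\text{cm}}$ by

$$N_{\text{sc}}^{\text{cm}} = N_{\text{inc}}^{\text{cm}}\, n_{\text{tar}}^{\text{cm}}\, \sigma_{\text{cm}} \tag{14.39}$$

where all four quantities are measured in the CM frame. In exactly the same way we
define $\sigma_{\text{lab}}$ by

$$N_{\text{sc}}^{\text{lab}} = N_{\text{inc}}^{\text{lab}}\, n_{\text{tar}}^{\text{lab}}\, \sigma_{\text{lab}} \tag{14.40}$$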

where all four quantities are measured in the lab frame. While any particular scattering 
event may look very different as viewed from the two different frames, the total 
number of events must be the same in either frame. Therefore, 

$$N_{\text{sc}}^{\text{cm}} = N_{\text{sc}}^{\text{lab}}.$$

In the same way, the number of incident particles is the same, as seen in either
frame, so that $N_{\text{inc}}^{\text{cm}} = N_{\text{inc}}^{\text{lab}}$. The target density $n_{\text{tar}}$ is the density of target particles
(number/area) seen from the incident direction, as illustrated in Figure 14.4. Since
this is unaffected by any forward (or backward) motion of the target, $n_{\text{tar}}^{\text{cm}} = n_{\text{tar}}^{\text{lab}}$.
Comparing now (14.39) and (14.40), we see that each of the first three terms of (14.39) 
is equal to the corresponding term in (14.40). Therefore the final terms must also be 
equal, and we have the elegantly simple result that the total scattering cross sections 
in the CM and lab frames are equal, 

$$\sigma_{\text{cm}} = \sigma_{\text{lab}} \qquad \text{[total scattering cross section]}. \tag{14.41}$$

If other outcomes were possible, such as absorption or ionization, then exactly the 
same argument would lead to exactly corresponding results; for example, the total 
absorption cross section is the same in the CM and lab frames. 

11 Some colliding beams are not perfectly collinear. Some experiments in atomic physics use
beams that intersect at a large angle, and in the collisions of molecules in a gas, the two particles
can approach one another at any angle. Although such oblique collisions are not especially difficult
to handle, I shall consider only collinear collisions here.

The differential cross section is a little more complicated. The scattering angle,
measured as $\theta_{\text{cm}}$ in the CM frame, will generally have a different value $\theta_{\text{lab}}$ in the lab
frame, and a given solid angle measured as $d\Omega_{\text{cm}}$ in the one will be measured as $d\Omega_{\text{lab}}$
in the other. Apart from these complications, we can use the same argument as before.
The definition of the differential cross section (14.17), 

$$N_{\text{sc}}(\text{into } d\Omega) = N_{\text{inc}}\, n_{\text{tar}}\, \frac{d\sigma}{d\Omega}\, d\Omega, \tag{14.42}$$

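can be used in either frame. Just as before, the numbers $N_{\text{inc}}$ and $n_{\text{tar}}$ have the same
values in either frame. Further, the number scattered into any chosen solid angle, called
$d\Omega_{\text{cm}}$ in the CM frame, is the same as the number going into the corresponding solid
angle, called $d\Omega_{\text{lab}}$, in the lab frame. Thus, as before, the first three terms in (14.42)
have the same values in both frames, and the same must therefore be true of the final
product; that is

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{lab}} d\Omega_{\text{lab}} = \left(\frac{d\sigma}{d\Omega}\right)_{\text{cm}} d\Omega_{\text{cm}} \tag{14.43}$$

or

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{lab}} = \left(\frac{d\sigma}{d\Omega}\right)_{\text{cm}} \frac{d\Omega_{\text{cm}}}{d\Omega_{\text{lab}}}. \tag{14.44}$$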


We see that the differential cross sections are not the same in the two frames, but only
because a given solid angle has different values ($d\Omega_{\text{cm}}$ and $d\Omega_{\text{lab}}$) according to the
frame used.

Since $d\Omega = \sin\theta\, d\theta\, d\phi = -d(\cos\theta)\, d\phi$, and since the azimuthal angle $\phi$ of the
outgoing momentum is the same in both frames, we can rewrite (14.44) in the more
useful, if perhaps less transparent, form

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{lab}} = \left(\frac{d\sigma}{d\Omega}\right)_{\text{cm}} \left|\frac{d(\cos\theta_{\text{cm}})}{d(\cos\theta_{\text{lab}})}\right|. \tag{14.45}$$


(The absolute value signs are needed since both cross sections are by definition 
positive, whereas the derivative on the right can sometimes be negative.) The problem 
of transforming the differential cross section from the CM to the lab frame is now 
reduced to the kinematic problem of finding $\theta_{\text{cm}}$ in terms of $\theta_{\text{lab}}$, or vice versa, and
then taking the indicated derivative. 


14.8 Relation of the CM and Lab Scattering Angles* 


* As usual, sections marked with an asterisk can be omitted on a first reading. 

Figure 14.14 (a) An elastic collision as seen in the CM frame. The two
particles approach the CM (shown as a cross) with equal and opposite
momenta. (b) The same collision as seen in the lab frame, where particle 2
is initially at rest.

To relate the CM and lab scattering angles, we need to look at the momenta of the
particles in both frames. Let us add subscripts “cm” and “lab” for these, and use a
prime to indicate the outgoing values (and “unprime” for the incoming). The CM
momenta are given by (14.38) as

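$$\mathbf{p}_{\text{cm}1} = -\mathbf{p}_{\text{cm}2} = \mathbf{p} \quad \text{(initial)} \tag{14.46}$$

and

$$\mathbf{p}'_{\text{cm}1} = -\mathbf{p}'_{\text{cm}2} = \mathbf{p}' \quad \text{(final)} \tag{14.47}$$

where, as usual, $\mathbf{p}$ and $\mathbf{p}'$ denote the relative momentum ($\mathbf{p} = \mu\dot{\mathbf{r}}$). These values are
illustrated in Figure 14.14(a). Notice that, by conservation of energy, all four momenta
have equal magnitudes in the CM frame.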

To be definite, I shall confine our discussion to the case that the “lab frame” 
is the traditional lab frame, in which particle 2 (the target) is initially at rest. The 
various momenta as seen in this frame are illustrated in Figure 14.14(b). To find these 
momenta, we can return to the two equations (14.37): 

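$$\mathbf{r}_1 = \mathbf{R} + \frac{m_2}{M}\mathbf{r} \quad\text{and}\quad \mathbf{r}_2 = \mathbf{R} - \frac{m_1}{M}\mathbf{r}. \tag{14.48}$$

Since particle 2 is initially at rest, the second of these implies that

$$\dot{\mathbf{R}} = \frac{m_1}{M}\dot{\mathbf{r}} = \frac{\mu\dot{\mathbf{r}}}{m_2} = \frac{\mathbf{p}}{m_2}. \tag{14.49}$$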

This is the velocity of the center of mass, as seen in the lab frame, and allows us 
to relate any of the lab momenta to their corresponding CM values. In particular, 
differentiating the first of Equations (14.48), we find that 

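$$\mathbf{p}_{\text{lab}1} = m_1\dot{\mathbf{r}}_1 = m_1\dot{\mathbf{R}} + \frac{m_1 m_2}{M}\dot{\mathbf{r}} = \frac{m_1}{m_2}\mathbf{p} + \mathbf{p}$$

or

$$\mathbf{p}_{\text{lab}1} = \lambda\mathbf{p} + \mathbf{p} \tag{14.50}$$

where I have introduced the important mass ratio

$$\lambda = \frac{m_1}{m_2}. \tag{14.51}$$

In exactly the same way, the final momentum $\mathbf{p}'_{\text{lab}1}$ is

$$\mathbf{p}'_{\text{lab}1} = \lambda\mathbf{p} + \mathbf{p}'. \tag{14.52}$$

Figure 14.15 The relation of initial and final momenta in the CM and lab
frames. The number $\lambda$ is the mass ratio $m_1/m_2$ and was chosen to be 0.5
for this picture. The momenta shown as $\mathbf{p}$ and $\mathbf{p}'$ are the initial and final
relative momenta, which are the same as $\mathbf{p}_{\text{cm}1}$ and $\mathbf{p}'_{\text{cm}1}$.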


The two results (14.50) and (14.52) are illustrated in Figure 14.15, where the lines
BC and BD represent the initial and final momenta of particle 1 in the CM frame,
while AC and AD are the corresponding lab values. By dropping a perpendicular
from the point D to the line AC, you can check (Problem 14.25) that

$$\tan\theta_{\text{lab}} = \frac{\sin\theta_{\text{cm}}}{\lambda + \cos\theta_{\text{cm}}}, \tag{14.53}$$

which tells us $\theta_{\text{lab}}$ in terms of $\theta_{\text{cm}}$. Before we use this result to give us the lab cross
section in terms of the corresponding CM value, let us use Figure 14.15 to establish
a few other results.
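
As a cross-check on (14.53), one can build the lab-frame velocity of the projectile directly, by rotating its CM velocity through $\theta_{\text{cm}}$ and adding back the CM velocity, and then compare the resulting angle with the formula. The sketch below does this; the function names and the sample masses and angles are assumptions made for illustration, with the target taken to be initially at rest as in the text.

```python
import math


def lab_angle_from_velocities(theta_cm, m1, m2, v0=1.0):
    """Lab scattering angle built from velocities (target initially at rest)."""
    M = m1 + m2
    v_cm = m1 * v0 / M          # speed of the CM in the lab, along x
    u = v0 - v_cm               # projectile's speed in the CM frame
    # Elastic scattering rotates the CM velocity through theta_cm
    ux = u * math.cos(theta_cm)
    uy = u * math.sin(theta_cm)
    # Transform back to the lab by adding the CM velocity
    return math.atan2(uy, ux + v_cm)


def lab_angle_from_formula(theta_cm, lam):
    """Equation (14.53), using atan2 so that angles beyond 90 degrees come out right."""
    return math.atan2(math.sin(theta_cm), lam + math.cos(theta_cm))


m1, m2 = 1.0, 2.0
lam = m1 / m2
checks = []
for deg in (10, 60, 120, 170):
    th = math.radians(deg)
    direct = math.degrees(lab_angle_from_velocities(th, m1, m2))
    formula = math.degrees(lab_angle_from_formula(th, lam))
    checks.append((deg, direct, formula))
    print(f"theta_cm = {deg:3d} deg: direct {direct:7.3f}, formula {formula:7.3f}")
```

Using `atan2` rather than a plain arctangent is a deliberate choice: when $\lambda + \cos\theta_{\text{cm}}$ is negative the lab angle lies beyond 90°, which a bare `atan` would miss.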

Because $|\mathbf{p}| = |\mathbf{p}'|$, the point D lies on a circle with center B and radius $p$, as
indicated. It is easy to see that, unless $\theta_{\text{cm}} = 0$ or $\pi$, $\theta_{\text{lab}}$ is always less than $\theta_{\text{cm}}$, as
you might expect. The details of Figure 14.15 depend on the relative sizes of the
two masses: Suppose first that $\lambda < 1$ (that is, the projectile is lighter than the target,
$m_1 < m_2$). In this case, the point A lies inside the circle as shown in Figure 14.15. (That
figure was drawn using a mass ratio $\lambda = 0.5$.) If $\theta_{\text{cm}} = 0$, then the point D coincides
with C and $\theta_{\text{lab}}$ is zero as well. If we imagine $\theta_{\text{cm}}$ increasing continuously from 0
to $\pi$, the point D moves continuously around the semicircle from C to E, with $\theta_{\text{lab}}$
always less than $\theta_{\text{cm}}$, until $\theta_{\text{cm}} = \pi$, at which point $\theta_{\text{lab}}$ is also equal to $\pi$. (That is, if
the projectile bounces straight back in the CM frame, the same is true in the lab,
at least if $\lambda < 1$.) This behavior is illustrated in Figure 14.16, which is a graph of $\theta_{\text{lab}}$
against $\theta_{\text{cm}}$ for a mass ratio $\lambda = 0.5$.

Figure 14.16 The lab scattering angle $\theta_{\text{lab}}$ as a function of the CM
angle $\theta_{\text{cm}}$ (14.53) for a mass ratio of 0.5 (that is, $m_1 = 0.5\, m_2$). The two
angles are equal at 0 and $\pi$, but $\theta_{\text{lab}} < \theta_{\text{cm}}$ everywhere else.


If $\lambda = 1$ (that is, the projectile and target have equal masses), the behavior of $\theta_{\text{lab}}$
as a function of $\theta_{\text{cm}}$ is surprisingly different. In this case, (14.53) reduces to

$$\theta_{\text{lab}} = \tfrac{1}{2}\theta_{\text{cm}} \tag{14.54}$$

(see Problem 14.27). Thus, as $\theta_{\text{cm}}$ varies from 0 to $\pi$, $\theta_{\text{lab}}$ runs from 0 to $\pi/2$; in
particular, in the lab frame of an equal-mass collision, the scattering angle can never
exceed 90°. If $\lambda > 1$, the situation is different again, as you can explore in Problem
14.31.
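
The contrast between $\lambda = 0.5$ and $\lambda = 1$ shows up clearly if (14.53) is tabulated for both. The sketch below is illustrative only: `lab_angle_deg` and the 15° grid are assumptions of this example, and the case $\lambda > 1$ is deliberately left out, since that is the subject of Problem 14.31.

```python
import math


def lab_angle_deg(theta_cm_deg, lam):
    """Lab angle in degrees from Equation (14.53), for mass ratio lam = m1/m2."""
    th = math.radians(theta_cm_deg)
    return math.degrees(math.atan2(math.sin(th), lam + math.cos(th)))


grid = list(range(0, 181, 15))
table = {lam: [lab_angle_deg(d, lam) for d in grid] for lam in (0.5, 1.0)}

print("theta_cm   lab (lam=0.5)   lab (lam=1)")
for d, a_half, a_one in zip(grid, table[0.5], table[1.0]):
    print(f"{d:8d}   {a_half:13.2f}   {a_one:11.2f}")

# Largest lab angle reached in the equal-mass case
max_equal_mass = max(table[1.0])
```

For $\lambda = 0.5$ the lab angle climbs all the way to 180°, whereas for $\lambda = 1$ the column is just half of $\theta_{\text{cm}}$ and never passes 90°.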

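To find the lab differential cross section, we need to find the derivative $d(\cos\theta_{\text{cm}})/d(\cos\theta_{\text{lab}})$
in (14.45). Using (14.53) it is a reasonably straightforward exercise (Problem
14.26) to show that

$$\frac{d(\cos\theta_{\text{lab}})}{d(\cos\theta_{\text{cm}})} = \frac{1 + \lambda\cos\theta_{\text{cm}}}{(1 + 2\lambda\cos\theta_{\text{cm}} + \lambda^2)^{3/2}}. \tag{14.55}$$

Substituting into (14.45), we find that

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{lab}} = \left(\frac{d\sigma}{d\Omega}\right)_{\text{cm}} \frac{(1 + 2\lambda\cos\theta_{\text{cm}} + \lambda^2)^{3/2}}{|1 + \lambda\cos\theta_{\text{cm}}|}. \tag{14.56}$$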


example 14.7 Hard Sphere Scattering Again 

Find the CM and lab differential cross sections for scattering of a point projectile
of mass $m_1$ off a hard sphere of radius $R$ and mass $m_2 = 2m_1$, and plot each as
a function of the appropriate scattering angle.

The CM cross section is the same as that for a particle with mass equal to
the reduced mass $\mu$, scattering off a fixed target. In Example 14.5 (page 573)
we found the latter to be just $R^2/4$. Therefore,

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{cm}} = \frac{R^2}{4}$$

and the lab cross section can be written down immediately from (14.56) (with
$\lambda = 0.5$). The two cross sections are plotted as functions of their respective
angles in Figure 14.17. 12 As we already knew, the CM cross section is isotropic.
The lab cross section is markedly skewed in the forward direction.

Figure 14.17 (a) The differential cross section for scattering off a hard
sphere is isotropic in the CM frame. (b) In the lab frame, it is peaked in
the forward direction.
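
The parametric approach to this example can be sketched in Python as follows. The code treats $\theta_{\text{cm}}$ as the parameter, uses (14.53) for the lab angle, and multiplies the isotropic CM cross section $R^2/4$ by the inverse of the derivative (14.55). The function names, the choice $R = 1$, and the trapezoidal integration used as a sanity check are assumptions of this sketch rather than anything prescribed by the text.

```python
import math


def lab_angle(theta_cm, lam):
    """Equation (14.53), returned in radians."""
    return math.atan2(math.sin(theta_cm), lam + math.cos(theta_cm))


def lab_cross_section(theta_cm, lam, radius=1.0):
    """Hard-sphere lab cross section: (R^2/4) divided by |derivative (14.55)|."""
    c = math.cos(theta_cm)
    jacobian = (1.0 + 2.0 * lam * c + lam ** 2) ** 1.5 / abs(1.0 + lam * c)
    return (radius ** 2 / 4.0) * jacobian


def parametric_curve(lam, radius=1.0, n=721):
    """Points (theta_lab, dsigma/dOmega_lab) with theta_cm running from 0 to pi."""
    points = []
    for i in range(n):
        theta_cm = math.pi * i / (n - 1)
        points.append((lab_angle(theta_cm, lam),
                       lab_cross_section(theta_cm, lam, radius)))
    return points


def total_from_lab(lam, radius=1.0, n=721):
    """Integrate the lab cross section over lab solid angle (trapezoid rule)."""
    pts = parametric_curve(lam, radius, n)
    total = 0.0
    for (th_a, f_a), (th_b, f_b) in zip(pts, pts[1:]):
        d_cos = abs(math.cos(th_a) - math.cos(th_b))
        total += 0.5 * (f_a + f_b) * 2.0 * math.pi * d_cos
    return total


lam = 0.5
forward = lab_cross_section(0.0, lam)        # theta_lab = 0
backward = lab_cross_section(math.pi, lam)   # theta_lab = pi
total = total_from_lab(lam)
print(f"forward {forward:.4f} R^2, backward {backward:.4f} R^2, total {total:.4f} R^2")
```

The list returned by `parametric_curve` can be passed to any plotting tool. The integrated total should come out close to $\pi R^2$, in line with the equality of the total cross sections in the two frames (14.41).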


12 Equation (14.56) gives the lab cross section as a function of the CM angle $\theta_{\text{cm}}$. To express it
as an explicit function of $\theta_{\text{lab}}$, one would have to solve (14.53) for $\theta_{\text{cm}}$ in terms of $\theta_{\text{lab}}$. To make the
plot of Figure 14.17(b), a much simpler procedure is to treat both $\theta_{\text{lab}}$ and $(d\sigma/d\Omega)_{\text{lab}}$ as functions
of the parameter $\theta_{\text{cm}}$ and make a parametric plot with $\theta_{\text{cm}}$ running from 0 to $\pi$.


Principal Definitions and Equations of Chapter 14

The Scattering Angle and Impact Parameter

The scattering angle is the angle $\theta$ by which a projectile is deflected in its encounter
with a target. The impact parameter is the distance $b$ by which the projectile would
have missed the center of the target if it had been undeflected. [Section 14.1]

The Collision Cross Section

The cross section $\sigma_{\text{oc}}$ for a particular outcome “oc” (elastic scattering, absorption,
reaction, fission) is defined by

$$N_{\text{oc}} = N_{\text{inc}}\, n_{\text{tar}}\, \sigma_{\text{oc}} \qquad \text{[Sections 14.2 \& 14.3]}$$

where $N_{\text{oc}}$ is the number of outcomes of the type considered, $N_{\text{inc}}$ is the number of
incident projectiles, and $n_{\text{tar}}$ is the density (number/area) of targets.

The Differential Cross Section

The differential cross section $\frac{d\sigma}{d\Omega}(\theta, \phi)$ for scattering in a direction $(\theta, \phi)$ is
defined by

$$N_{\text{sc}}(\text{into } d\Omega) = N_{\text{inc}}\, n_{\text{tar}}\, \frac{d\sigma}{d\Omega}(\theta, \phi)\, d\Omega. \qquad \text{[Eq. (14.17)]}$$


Calculating the Differential Cross Section 


If you can find the scattering angle $\theta$ as a function of the impact parameter $b$ (or vice
versa), then

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|. \qquad \text{[Eq. (14.23)]}$$
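
As a small illustration of how (14.23) is used, the sketch below differentiates $b(\theta)$ numerically and applies the formula. It assumes the standard hard-sphere relation $b = R\cos(\theta/2)$ (the relation the problems refer to as (14.24)); the function name `dsigma_domega`, the step size `h`, and the sample angles are choices made here for illustration.

```python
import math


def dsigma_domega(b_of_theta, theta, h=1e-6):
    """Equation (14.23): (b / sin(theta)) * |db/dtheta|, via a central difference."""
    db_dtheta = (b_of_theta(theta + h) - b_of_theta(theta - h)) / (2.0 * h)
    return b_of_theta(theta) / math.sin(theta) * abs(db_dtheta)


R = 1.0


def hard_sphere_b(theta):
    """Impact parameter for a hard sphere of radius R (assumed relation b = R cos(theta/2))."""
    return R * math.cos(theta / 2.0)


values = [dsigma_domega(hard_sphere_b, math.radians(deg)) for deg in (30, 90, 150)]
for deg, val in zip((30, 90, 150), values):
    print(f"theta = {deg:3d} deg: dsigma/dOmega = {val:.6f} R^2")
```

Each value should agree with the isotropic result $R^2/4$; swapping in a different `b_of_theta` gives the cross section for another force law, provided $b(\theta)$ is known.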


The Rutherford Formula 


The differential cross section for scattering a charge $q$ off a fixed charge $Q$ is given
by the Rutherford formula

$$\frac{d\sigma}{d\Omega} = \left(\frac{kqQ}{4E\sin^2(\theta/2)}\right)^2. \qquad \text{[Eq. (14.32)]}$$


The CM and Lab Cross Sections 


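The lab frame is generally understood to be the frame in which the target is at rest;
the CM frame is that in which the CM is at rest. The differential cross sections in the
two frames satisfy

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{lab}} = \left(\frac{d\sigma}{d\Omega}\right)_{\text{cm}} \left|\frac{d(\cos\theta_{\text{cm}})}{d(\cos\theta_{\text{lab}})}\right|. \qquad \text{[Eq. (14.45)]}$$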


Problems for Chapter 14

Stars indicate the approximate level of difficulty, from easiest (★) to most difficult (★★★).

section 14.2 The Collision Cross Section 

14.1 ★ A blueberry pancake has diameter 15 cm and contains 6 large blueberries, each of diameter 1
cm. Find the cross section $\sigma$ of a blueberry and the “target” density $n_{\text{tar}}$ (number/area) of berries in the
pancake, as seen from above. What is the probability that a skewer, jabbed at random into the pancake,
will hit a berry (in terms of $\sigma$ and $n_{\text{tar}}$, and then numerically)?

14.2 ★ (a) A certain nucleus has radius 5 fm. (1 fm = $10^{-15}$ m.) Find its cross section $\sigma$ in barns. (1
barn = $10^{-28}$ m$^2$.) (b) Do the same for an atom of radius 0.1 nm. (1 nm = $10^{-9}$ m.)

14.3 ★ A beam of particles is directed through a tank of liquid hydrogen. If the tank’s length is 50 cm
and the liquid density is 0.07 gram/cm$^3$, what is the target density (number/area) of hydrogen atoms
seen by the incident particles?

14.4 ★★ The cross section for scattering a certain nuclear particle by a copper nucleus is 2.0 barns.
If $10^9$ of these particles are fired through a copper foil of thickness 10 $\mu$m, how many particles are
scattered? (Copper’s density is 8.9 gram/cm$^3$ and its atomic mass is 63.5. The scattering by any atomic
electrons is completely negligible.)

14.5 ★★ The cross section for scattering a certain nuclear particle by a nitrogen nucleus is 0.5 barns. If
$10^{11}$ of these particles are fired through a cloud chamber of length 10 cm, containing nitrogen at STP,
how many particles are scattered? (Use the ideal gas law and remember that each nitrogen molecule
has two atoms. The scattering by any atomic electrons is completely negligible.)

14.6 ★★ Our definition of the scattering cross section, $N_{\text{sc}} = N_{\text{inc}}\, n_{\text{tar}}\, \sigma$, applies to an experiment using a
narrow beam of projectiles all of which pass through a wide target assembly. Experimenters sometimes
use a wide incident beam, which completely engulfs a small target assembly (the beam of photons from a
car’s headlamp, directed at a small piece of plastic, for example). Show that in this case $N_{\text{sc}} = n_{\text{inc}}\, N_{\text{tar}}\, \sigma$,
where $n_{\text{inc}}$ is the density (number/area) of the incident beam, viewed head-on, and $N_{\text{tar}}$ is the total
number of targets in the target assembly.

section 14.4 The Differential Scattering Cross Section 

14.7 ★ Calculate the solid angles subtended by the moon and by the sun, both as seen from the earth.
Comment on your answers. (The radii of the moon and sun are $R_m = 1.74 \times 10^6$ m and $R_s = 6.96 \times 10^8$
m. Their distances from earth are $d_m = 3.84 \times 10^8$ m and $d_s = 1.50 \times 10^{11}$ m.)

14.8 ★ In their famous experiment, Rutherford’s assistants, Geiger and Marsden, detected the scattered
alpha particles using a zinc sulphide screen, which produced a tiny flash of light when struck by an
alpha particle. If their screen had area 1 mm$^2$ and was 1 cm from the target, what solid angle did it
subtend?

14.9 ★ By integrating the element of solid angle (14.15), $d\Omega = \sin\theta\, d\theta\, d\phi$, over all directions, verify
that the solid angle corresponding to all directions is $4\pi$ steradians.

14.10 ★ By evaluating the necessary integral, verify the result (14.20) for the total cross section of
Example 14.4 (page 571). (This is very easy if you change variables to $u = \cos\theta$.)

14.11 ★★ The differential cross section for scattering 6.5-MeV alpha particles at 120° off a silver
nucleus is about 0.5 barns/sr. If a total of $10^{10}$ alphas impinge on a silver foil of thickness 1 $\mu$m and
if we detect the scattered particles using a counter of area 0.1 mm$^2$ at 120° and 1 cm from the target,
about how many scattered alphas should we expect to count? (Silver has a specific gravity of 10.5, and
atomic mass of 108.)

14.12 ★★★ [Computer] In quantum scattering theory, the differential cross section is equal to the
absolute value squared of a complex number $f(\theta)$, called the scattering amplitude:

$$\frac{d\sigma}{d\Omega} = |f(\theta)|^2. \tag{14.57}$$

The scattering amplitude can in turn be written as an infinite series

$$f(\theta) = \frac{\hbar}{p} \sum_{\ell=0}^{\infty} (2\ell + 1)\, e^{i\delta_\ell} \sin\delta_\ell\, P_\ell(\cos\theta). \tag{14.58}$$

Here $\hbar = 1.05 \times 10^{-34}$ J·s is called “h bar” and is Planck’s constant divided by $2\pi$, and $p$ is the
momentum of the incident projectile. The real numbers $\delta_\ell$ are called the phase shifts and depend on
the nature of the projectile and target and on the incident energy. And $P_\ell(\cos\theta)$ is the so-called Legendre
polynomial. ($P_0 = 1$, $P_1 = \cos\theta$, etc.)

The partial-wave series (14.58) is especially useful at low energies where only a few of the phase
shifts are different from zero. (a) Write down this series for the case of 10-MeV neutrons (mass
$m = 1.675 \times 10^{-27}$ kg) scattering off a certain heavy nucleus for which $\delta_0 = -30°$, $\delta_1 = 150°$, and
all other phase shifts are negligible. (b) Find an expression for the differential cross section (in terms
of $\hbar$, $m$, the incident energy $E$, and the two nonzero phase shifts) and plot it for $0 \le \theta \le 180°$. (c) Find
the total scattering cross section.


section 14.5 Calculating the Differential Cross Section 

14.13 ★★ In deriving the cross section for scattering by a hard sphere, we used the “law of reflection,”
that the angles of incidence and reflection of a particle bouncing off a hard sphere are equal, as in
Figure 14.10. Use conservation of energy and angular momentum to prove this law. (The definition
of “hard-sphere scattering” is that a projectile bounces with its kinetic energy unchanged. That the
force is spherically symmetric implies, as usual, that angular momentum about the sphere’s center is
conserved.)

14.14 ★★ One can set up a two-dimensional scattering theory, which could be applied to puck projectiles
sliding on an ice rink and colliding with various target obstacles. The cross section $\sigma$ would be the
effective width of a target, and the differential cross section $d\sigma/d\theta$ would give the number of projectiles
scattered in the angle $d\theta$. (a) Show that the two-dimensional analog of (14.23) is $d\sigma/d\theta = |db/d\theta|$.
(Note that in two-dimensional scattering it is convenient to let $\theta$ range from $-\pi$ to $\pi$.) (b) Now consider
the scattering of a small projectile off a hard “sphere” (actually a hard disk) of radius $R$ pinned down
to the ice. Find the differential cross section. (Note that in two dimensions, hard “sphere” scattering
is not isotropic.) (c) By integrating your answer to part (b), show that the total cross section is $2R$ as
expected.

14.15 ★★★ [Computer] Consider a point projectile moving in a fixed, spherical force whose potential
energy is

$$U(r) = \begin{cases} -U_0 & (0 < r < R) \\ 0 & (R < r) \end{cases} \tag{14.59}$$

where $U_0$ is a positive constant. This so-called spherical well represents a projectile which moves
freely in either of the regions $r < R$ and $R < r$, but, when it crosses the boundary $r = R$, receives
a radially inward impulse that changes its kinetic energy by $\pm U_0$ ($+U_0$ going inward, $-U_0$ going
outward). (a) Sketch the orbit of a projectile that approaches the well with momentum $p_0$ and impact
parameter $b < R$. (b) Use conservation of energy to find the momentum $p$ of the projectile inside the
well ($r < R$). Let $\zeta$ denote the momentum ratio $\zeta = p_0/p$ and let $d$ denote the projectile’s distance of
closest approach to the origin. Use conservation of angular momentum to show that $d = \zeta b$. (c) Use
your sketch to prove that the scattering angle $\theta$ is

$$\theta = 2\left(\arcsin\frac{b}{R} - \arcsin\frac{\zeta b}{R}\right). \tag{14.60}$$

This gives $\theta$ as a function of $b$, which is what you need to get the cross section. The relation depends
on the momentum ratio $\zeta$, which in turn depends on the incoming momentum $p_0$ and the well depth
$U_0$. Plot $\theta$ as a function of $b$ for the case that $\zeta = 0.5$. (d) By differentiating $\theta$ with respect to $b$, find an
expression for the differential cross section as a function of $b$, and make a plot of $d\sigma/d\Omega$ against $\theta$ for
the case that $\zeta = 0.5$. Comment. [Hint: To plot as a function of $\theta$ you don’t need to solve for $b$ in terms
of $\theta$; instead, you can make a parametric plot of the point $(\theta, d\sigma/d\Omega)$ as a function of the parameter
$b$ running from 0 to $R$.] (e) By integrating $d\sigma/d\Omega$ over all directions, find the total cross section.

section 14.6 Rutherford Scattering 

14.16 ★ One of the specific predictions of Rutherford’s model of the atom was that the cross section
should be inversely proportional to $E^2$ or, equivalently, to $v^4$. To test this, Geiger and Marsden varied
the speed $v$ by passing the incident alphas through thin sheets of mica to slow them down. According
to Rutherford’s prediction, the product of $N_{\text{sc}}$ and $v^4$ should be the same whatever the incident speed
(provided all other variables were held constant). Add a row showing $N_{\text{sc}} v^4$ to this table of their data
and see how well Rutherford’s prediction was confirmed.

Number of mica sheets            0      1      2      3      4      5      6
Counts, N_sc (per min)        24.7     29   33.4     44     81    101    255
Speed, v (arbitrary units)       1   0.95   0.90   0.85   0.77   0.69   0.57

14.17 ★ Another specific prediction of Rutherford’s model of the atom was that the cross section should
be proportional to the nuclear charge squared, that is, to $Z^2$, where $Z$ is the atomic number, the number
of protons in the nucleus. To test this, Geiger and Marsden counted the number of scatterings off various
different targets (holding all other variables fixed), with the following results:

Target    Gold   Platinum    Tin   Silver   Copper   Aluminum
N_sc      1319       1217    467      420      152         26
Z           79         78     50       47       29         13

Add a row to this table to show the ratio $N_{\text{sc}}/Z^2$ and see how well Rutherford’s prediction was
confirmed. (At the time the atomic number was not known with certainty, nor was it well understood.
Rutherford had guessed, correctly, that the nuclear charge was roughly equal to half the atomic mass,
and this is what they used in place of $Z$.) The relatively poor agreement for the case of aluminum is
probably due to our neglect of the target recoil, which is more important for the lighter targets.

14.18 ★★ One of the first observations that suggested his nuclear model of the atom to Rutherford was that
several alpha particles got scattered by metal foils into the backward hemisphere, $\pi/2 \le \theta \le \pi$, an
observation that was impossible to explain on the basis of rival atomic models, but emerged naturally
from the nuclear model. In an early experiment, Geiger and Marsden measured the fraction of incident
alphas scattered into the backward hemisphere off a platinum foil. By integrating the Rutherford cross
section (14.33) over the backward hemisphere, show that the cross section for scattering with $\theta \ge 90°$
should be $4\pi\sigma_0(E)$. Using the following numbers, predict the ratio $N_{\text{sc}}(\theta \ge 90°)/N_{\text{inc}}$: thickness of
platinum foil ≈ 3 $\mu$m, density = 21.4 gram/cm$^3$, atomic weight = 195, atomic number = 78, energy of
incident alphas = 7.8 MeV. Compare your answer with their estimate that “of the incident α particles
about 1 in 8000 was reflected” (that is, scattered into the backward hemisphere). Small as this fraction
is, it was still far larger than any rival model of the atom could explain.

14.19 ★★ An important simplification in our derivation of the Rutherford cross section was that the
projectile’s orbit is symmetric about the direction $\mathbf{u}$ of closest approach. (See Figure 14.11.) Prove that
this is true of almost any conservative central force, as follows: (a) Assume that the effective potential
(actual plus centrifugal) behaves as in Figure 8.4; that is, it approaches zero as $r \to \infty$ and approaches
$+\infty$ as $r \to 0$. 13 Use this to prove that any projectile that comes in from infinity must reach a minimum
value $r_{\text{min}}$ and then move out to infinity again. (b) This implies that a projectile must visit any value
of $r$ between $r_{\text{min}}$ and infinity exactly twice, once on the way in and again on the way out. Prove that
the values of $\dot{r}$ at these two points are equal and opposite, and that the values of $\dot{\psi}$ are equal (where
$\psi$ is the polar angle defined in Figure 14.11). (c) Use these results to prove that the orbit is symmetric
under reflection about the direction $\mathbf{u}$.


14.20 ★★ The derivation of the Rutherford cross section was made simpler by the fortuitous cancellation
of the factors of $r$ in the integral (14.30). Here is a method of finding the cross section which works,
in principle, for any central force field: The general appearance of the scattering orbit is as shown in
Figure 14.11. It is symmetric about the direction $\mathbf{u}$ of closest approach. (See Problem 14.19.) If $\psi$ is the
projectile’s polar angle, measured from the direction $\mathbf{u}$, then $\psi \to \pm\psi_0$ as $t \to \pm\infty$ and the scattering
angle is $\theta = \pi - 2\psi_0$. (See Figure 14.11 again.) The angle $\psi_0$ is equal to $\int \dot{\psi}\, dt$ taken from the time
of closest approach to $\infty$. Using the now-familiar trick you can rewrite this as $\int (\dot{\psi}/\dot{r})\, dr$. Next rewrite
$\dot{\psi}$ in terms of the angular momentum $\ell$ and $r$, and rewrite $\dot{r}$ in terms of the energy $E$ and the effective
potential $U_{\text{eff}}$ defined in Equation (8.35). Having done all this you should be able to prove that

$$\theta = \pi - 2\int_{r_{\text{min}}}^{\infty} \frac{(b/r^2)\, dr}{\sqrt{1 - (b/r)^2 - U(r)/E}}. \tag{14.61}$$

Provided this integral can be evaluated, it gives $\theta$ in terms of $b$, and hence the cross section (14.23).
For examples of its use, see Problems 14.21, 14.22, and 14.23.

14.21 ★★ Use the general relation (14.61) from Problem 14.20 to rederive the relation (14.24) for
scattering by a hard sphere.

14.22 ★★★ Use the general relation (14.61) from Problem 14.20 to rederive the relation (14.31) for
Rutherford scattering.

14.23 ★★★ Consider the scattering of a particle with energy $E$ by a fixed, repulsive $1/r^3$ force field,
with potential energy $U = \gamma/r^2$. Use the relation (14.61) from Problem 14.20 to find $\theta$ in terms of $b$
and hence show that the differential cross section is

$$\frac{d\sigma}{d\Omega} = \frac{\gamma}{E}\, \frac{\pi^2 (\pi - \theta)}{\theta^2 (2\pi - \theta)^2 \sin\theta}. \tag{14.62}$$

To refresh your memory as to how to find $r_{\text{min}}$ you might look at Figure 8.5 (the case $E > 0$). You
should be able to solve your equation for $\theta$ in terms of $b$ to get $b$ in terms of $\theta$ and thence the cross
section.


SECTION 14.8 Relation of the CM and Lab Scattering Angles*

14.24 ** Consider the scattering of two particles of equal mass (for example, scattering of protons off
protons). In this case, $\theta_{\text{lab}} = \tfrac{1}{2}\theta_{\text{cm}}$. (See Problem 14.27.) (a) Use this result in (14.45) to prove that

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{lab}} = 4\cos\theta_{\text{lab}}\left(\frac{d\sigma}{d\Omega}\right)_{\text{cm}} \qquad (14.63)$$

when the projectile and target have equal masses.
(b) Write down the lab cross section for the scattering of two equal-mass hard spheres. (We know that
in the CM frame, the differential cross section is $R^2/4$ where $R = R_1 + R_2$.) By integrating over all
directions, verify that the total cross section in the lab frame is $\pi R^2$, as it has to be.


13 Although this behavior is definitely the norm, there are a few force fields for which it is not true. If the actual
potential energy is strongly attractive near $r = 0$ (for example, $U(r) = -1/r^3$), then it dominates the centrifugal
potential near $r = 0$, and the effective potential does not approach $+\infty$ as $r \to 0$. Our argument also breaks down
for the special case that $b = 0$, in which case the projectile may smash directly into the target.

14.25 ** Using Figure 14.15, prove Equation (14.53), that

$$\tan\theta_{\text{lab}} = \frac{\sin\theta_{\text{cm}}}{\lambda + \cos\theta_{\text{cm}}} \qquad (14.64)$$
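
For readers who want to explore (14.64) numerically, here is a minimal Python sketch. The function name `lab_angle` is an illustrative assumption, and $\lambda$ is taken to be the mass ratio $m_1/m_2$, as in the rest of this chapter. Using `math.atan2` rather than `math.atan` is a design choice: it returns an angle between 0 and $\pi$ even when the denominator $\lambda + \cos\theta_{\text{cm}}$ is negative, which can happen when $\lambda < 1$.

```python
import math

def lab_angle(theta_cm, lam):
    """Lab scattering angle from Eq. (14.64); lam is the mass ratio m1/m2 (angles in radians)."""
    return math.atan2(math.sin(theta_cm), lam + math.cos(theta_cm))

# Equal masses (lam = 1): the lab angle should come out as half the CM angle.
for deg in (30, 90, 150):
    theta_cm = math.radians(deg)
    print(deg, math.degrees(lab_angle(theta_cm, 1.0)))
```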


14.26 ** Using suitable trig identities, rewrite (14.64) to give $\cos\theta_{\text{lab}}$ as a function of $\cos\theta_{\text{cm}}$, and
verify the derivative (14.55) that was essential for finding the relation between the CM and lab cross
sections.


14.27 ** (a) By setting the mass ratio $\lambda = 1$ in Equation (14.64) of Problem 14.25, prove that in equal-mass
scattering, $\theta_{\text{lab}} = \tfrac{1}{2}\theta_{\text{cm}}$. (b) Redraw Figure 14.15 for the case that $\lambda = 1$ ($m_1 = m_2$) and explain
why the maximum value of $\theta_{\text{lab}}$ is $\pi/2$.

14.28 ** It is often interesting to know about the momentum of the recoiling target particle in the lab
frame. Let us denote by $\xi_{\text{lab}}$ the recoil angle, defined as the angle between the recoil momentum $\mathbf{p}_2'^{\,\text{lab}}$
and the incident direction (the angle below the dashed line in Figure 14.14). (a) Show that in Figure
14.15 the recoil momentum is represented by the vector DC. Deduce that $\xi_{\text{lab}} = (\pi - \theta_{\text{cm}})/2$. (b) Show
that, in the special case of equal masses ($m_1 = m_2$), $\xi_{\text{lab}} + \theta_{\text{lab}} = \pi/2$; that is, the angle between the
two outgoing particles in an equal-mass, elastic collision is 90°. (c) Prove this last result directly using
just conservation of momentum and energy (in an elastic collision).

14.29 ** An elastic collision is defined as one in which the total kinetic energy of the two particles is 
the same before and after the collision, (a) Show that in the CM frame, the individual kinetic energies 
of the two particles are separately conserved in an elastic collision, (b) Explain clearly why the same 
result is obviously not true in the lab frame. (Think about the energy of the target particle.) (c) Let 
$\Delta E$ denote the energy gained by the target particle in the collision (and hence the energy lost by
the projectile). Using Figure 14.15, show that the fractional energy lost by the projectile (in the lab
frame) is

$$\frac{\Delta E}{E} = \frac{4\lambda}{(1 + \lambda)^2}\,\sin^2(\theta_{\text{cm}}/2)$$

where, as usual, $\lambda$ is the mass ratio $m_1/m_2$. (Note that in Figure 14.15 the line DC represents the
recoil momentum of the target.) (d) For a given mass ratio $\lambda$, what sort of collision gives the largest
fractional energy loss? What value of $\lambda$ maximizes this energy loss? (Your answer is important in
situations where one wants a particle to lose energy as quickly as possible, as in a nuclear reactor,
for example.)

14.30 *** If you have not already done so, do Problem 14.24. (a) Now consider the scattering of two
equal-mass hard spheres, A and B, with B initially stationary. Write down the standard expression
(14.42) for the number of projectiles A scattered into a solid angle $d\Omega$ at a chosen angle $\theta$. Call this
number $N$(A into $d\Omega$ at $\theta$). Now suppose that we monitor for the number of target particles B recoiling
into the same solid angle $d\Omega$ at the same angle $\theta$. Find $N$(B into $d\Omega$ at $\theta$), the number of B’s that
will be observed. How does this compare with the number of A’s?

14.31 *** [Computer] Consider the elastic scattering of a projectile that is heavier than the target, that
is, $m_1 > m_2$ or $\lambda > 1$. (a) Draw the analog of Figure 14.15 for this case. Show clearly that there are
two different values of the CM angle $\theta_{\text{cm}}$ corresponding to each value of $\theta_{\text{lab}}$. (b) What are the two
CM angles that correspond to $\theta_{\text{lab}} = 0$? In terms of this example, explain why there is this two-fold
ambiguity in $\theta_{\text{cm}}$ when $m_1 > m_2$. (c) Plot $\theta_{\text{lab}}$ as a function of $\theta_{\text{cm}}$ for the case that $\lambda = 2$. (d) Use your
picture from part (a) to find an expression for the maximum possible value of $\theta_{\text{lab}}$ for a given value of
$\lambda$. Check that your answer is correct for the case that $\lambda = 1$.




CHAPTER 15

Special Relativity


From its publication in 1687 until 1905, Newtonian mechanics reigned supreme. It was 
applied to more and more systems, almost always with complete success. In those rare 
instances where Newtonian ideas appeared to fail, it was found that some complication 
had been overlooked, and, when this complication was included, Newton could again 
account for all the observations. 1 Newton’s formulation was supplemented with new 
ideas (such as the notion of energy) and recast in different guises (by Lagrange and 
Hamilton), but the foundations seemed unshakeable. Then, toward the end of the 
nineteenth century, a few observations were made that seemed inconsistent with the 
classical, Newtonian, ideas. Heroic efforts were made to bring these observations into 
line with classical physics, but in 1905, Albert Einstein (1879-1955) published his first 
paper on the theory that we now call relativity, in which he showed that particles with 
speeds approaching the speed of light require a completely new form of mechanics, 
as I describe in this chapter. Even at slower speeds, Newtonian mechanics is only 
an approximation to the new “relativistic mechanics,” but the difference is usually 
so small as to be undetectable. In particular, at the speeds usually encountered on 
earth, Newtonian mechanics is completely satisfactory, which explains why it is still 
a crucial and interesting part of physics (and justifies the other 15 chapters of this 
book). 2 


1 Perhaps the greatest such triumph for Newton was the prediction and discovery of the planet
Neptune: Calculations of the orbit of Uranus (taking account of the other known planets, and based, of
course, on Newtonian mechanics) disagreed with the observed position by some 1.5 minutes of arc.
In 1846, it was shown independently by the English astronomer John Couch Adams (1819-1892)
and the Frenchman Urbain Leverrier (1811-1877) that this discrepancy could be explained by the
presence of a hitherto unnoticed planet outside the orbit of Uranus. Within a few months, the new
planet, now called Neptune, was discovered by the German Johann Galle (1812-1910) almost exactly at its
predicted position.

2 In writing this chapter on relativity (particularly in the opening sections and the problems), it 
was sometimes difficult to resist borrowing ideas from the relativity chapters of Modern Physics, by 
Chris Zafiratos, Michael Dubson, and myself (second edition, Prentice Hall, 2003). I am grateful to
Prentice Hall for giving me permission to do so.
15.1 Relativity


Let us first consider the significance of the name “relativity.” A moment’s thought 
should convince you that most physical measurements are made relative to a chosen 
reference system. That the position of a particle is $\mathbf{r} = (x, y, z)$ means that its position
vector has components $(x, y, z)$ relative to some chosen origin and a chosen set of
axes. That an event occurs at time $t = 5$ s means that $t$ is 5 seconds relative to a
chosen origin of time, t = 0. If we measure the kinetic energy T of a car, it makes a 
big difference whether T is measured relative to a reference frame fixed on the road or 
to one fixed in the car. Almost all measurements require the specification of a reference 
frame, relative to which the measurement is to be made, and we can refer to this fact 
as the relativity of measurements. 

The theory of relativity is the study of the consequences of the relativity of 
measurements. At first thought, this would seem unlikely to be a very interesting 
topic, but Einstein showed that a careful study of how measurements depend on the 
choice of coordinate system revolutionizes our whole conception of space and time, 
and requires a complete rethinking of Newtonian mechanics. 

Einstein’s relativity is really two theories. The first, called special relativity, is 
“special” in that it focuses primarily on unaccelerated frames of reference. The second, 
called general relativity, is “general” in that it includes accelerated reference frames. 
Einstein found that the study of accelerated frames leads naturally to a theory of 
gravitation, and general relativity turns out to be the relativistic theory of gravity. 
In practice, general relativity is required only in situations where its predictions 
differ appreciably from those of Newtonian gravity. These include the study of the 
intense gravity of black holes, of the large-scale universe, and of the effect of the 
earth’s gravity on the extremely accurate time measurements needed for the global 
positioning system. In nuclear and particle physics, where we consider particles that 
move near the speed of light, but where gravity is usually completely negligible, 
special relativity is normally all that is needed. In this chapter, I shall treat only the 
special theory of relativity. 3 


15.2 Galilean Relativity 


Many of the ideas of relativity are present in classical physics, and we have in fact 
met several in earlier chapters. Let us review these ideas and recast some of them in 
a form more suitable for our discussion of Einstein’s relativity. 

As we discussed in Chapter 1, Newton’s laws hold in many different reference 
frames, namely, the so-called inertial frames, any one of which moves at constant 
velocity relative to any other. We can rephrase this to say that, in classical physics, 


3 To cover general relativity would require another book. Some good references are: R. Geroch, 
General Relativity from A to B, University of Chicago Press, 1978; I. R. Kenyon, General Relativity, 
Oxford University Press, 1990; B. F. Schutz, A First Course in General Relativity, Cambridge
University Press, 1985; and James B. Hartle, Gravity: An Introduction to Einstein’s General 
Relativity, Addison Wesley, 2003. 
Newton’s laws are invariant (that is, unchanged) as we transfer our attention from
one inertial frame to another. The classical transformation from one frame to a second, 
moving at constant velocity relative to the first, is called the Galilean transformation, 
so a compact way to say the same result is that Newton’s laws are invariant under the 
Galilean transformation. Let us first review this claim. 


The Galilean Transformation 

For simplicity, consider first two frames S and S' that are oriented the same way; that 
is, the x' axis is parallel to the x axis, y' parallel to y, and z' parallel to z. Suppose 
further that the velocity V of S' relative to S is along the x axis. It was a fundamental 
assumption of Newtonian mechanics that there is a single universal time t. Thus if the 
observers in S and S' agree to synchronize their clocks (and to use the same unit of 
time), then t' = t. Finally, we can choose our origins O and O' so that they coincide 
at the time t = t' = 0. This configuration is illustrated in Figure 15.1, where S is a
frame fixed to the ground. (We’ll assume that a frame fixed to the earth is inertial — 
that is, we’ll ignore the slow rotation of the earth.) The frame S' is fixed in a train that 
is traveling with velocity V along the x axis. 

Consider now some event, such as the explosion of a small firecracker. As measured
by observers in S this occurs at position $\mathbf{r} = (x, y, z)$ and time $t$; as measured in S'
it occurs at $\mathbf{r}' = (x', y', z')$ and time $t'$. Our first (and very simple) task is to establish
the mathematical relation between the coordinates $(x, y, z, t)$ and $(x', y', z', t')$. A
moment’s inspection of Figure 15.1 should convince you that $x' = x - Vt$, and that
$y' = y$ and $z' = z$. By the classical assumption concerning time, $t' = t$, so the required
relations are

$$\begin{aligned} x' &= x - Vt \\ y' &= y \\ z' &= z \\ t' &= t. \end{aligned} \qquad (15.1)$$

These four equations are called the Galilean transformation. They give the coordinates
$(x', y', z', t')$ of any event as measured in S' in terms of the corresponding
coordinates $(x, y, z, t)$ of the same event as measured in S. They are the mathematical
expression of the classical ideas about space and time.
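
To make the four relations concrete, here is a short Python sketch of the Galilean transformation in the standard configuration. The function name `galilean_transform` and the numbers (a train at 30 m/s, a firecracker at x = 100 m and t = 2 s) are illustrative assumptions, not values from the text.

```python
def galilean_transform(x, y, z, t, V):
    """Coordinates in S' of an event (x, y, z, t) in S, for S' moving at speed V along x."""
    return (x - V * t, y, z, t)

# Hypothetical event: firecracker at x = 100 m, y = 0, z = 1.5 m, t = 2 s; train speed 30 m/s.
event_in_S = (100.0, 0.0, 1.5, 2.0)
event_in_S_prime = galilean_transform(*event_in_S, V=30.0)
print(event_in_S_prime)  # (40.0, 0.0, 1.5, 2.0)
```

Only the x coordinate changes; in particular the time of the event is the same in both frames, which is exactly the classical assumption that relativity will force us to abandon.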
Figure 15.1 The frame S is fixed to the ground, while S' is fixed in a railroad
car traveling with constant velocity V in the x direction. The two origins
coincide, O = O', at time t = t' = 0. The star indicates an event, such as a
small explosion.
The Galilean transformation (15.1) relates the coordinates measured in two frames
arranged with corresponding axes parallel and with relative velocity along the x axis,
as shown in Figure 15.1, an arrangement we can call the standard configuration.
This is not, of course, the most general configuration. For example, if the relative
velocity $\mathbf{V}$ is in an arbitrary direction, it is easy to see that (15.1) can be rewritten
compactly as

$$\mathbf{r}' = \mathbf{r} - \mathbf{V}t \quad\text{and}\quad t' = t. \qquad (15.2)$$

This is still not the most general form of the Galilean transformation, since we could
rotate the axes, so that corresponding axes were no longer parallel, and we could
displace the origins O or O' and the origins of time. However, (15.2) is general enough
for our present purposes.

Using the Galilean transformation (15.2) we can immediately relate the velocities
of an object, as measured in the two frames. If $\mathbf{v}(t) = \dot{\mathbf{r}}(t)$ is the velocity of the object
as measured in S and $\mathbf{v}'(t)$ is likewise for S' then by differentiating (15.2) we find
immediately that (remember that $\mathbf{V}$ is constant)

$$\mathbf{v}' = \mathbf{v} - \mathbf{V}. \qquad (15.3)$$

This is the classical velocity-addition formula, which asserts that, according to the 
ideas of classical physics, relative velocities add (or subtract) according to the normal 
rules of vector arithmetic. 
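
The step from (15.2) to (15.3) can be checked numerically. The Python sketch below assumes a hypothetical straight-line motion in S, transforms the position to S' with (15.2), and differentiates both positions by finite differences. All function names and numerical values are illustrative assumptions.

```python
def position_in_S(t):
    """Hypothetical straight-line motion in S: r(t) = r0 + v*t (SI units)."""
    r0, v = (1.0, 2.0, 0.0), (20.0, 5.0, 0.0)
    return tuple(r0i + vi * t for r0i, vi in zip(r0, v))

def position_in_S_prime(t, V):
    """Eq. (15.2): r' = r - V*t, with t' = t."""
    return tuple(ri - Vi * t for ri, Vi in zip(position_in_S(t), V))

def velocity(position, t, dt=1e-3):
    """Central-difference estimate of dr/dt for a position function of time."""
    ahead, behind = position(t + dt), position(t - dt)
    return tuple((a - b) / (2 * dt) for a, b in zip(ahead, behind))

V = (30.0, 0.0, 0.0)  # assumed velocity of S' relative to S, in m/s
v = velocity(position_in_S, t=2.0)
v_prime = velocity(lambda t: position_in_S_prime(t, V), t=2.0)
print(v, v_prime)  # v_prime should match v - V, that is (-10, 5, 0), up to rounding
```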


Galilean Invariance of Newton’s Laws 

To prove the invariance of Newton’s laws under the Galilean transformation, suppose 
that the second law holds in frame S; that is, that F = ma, with all three variables
measured in S. Now it is an experimental fact (at least in the domain of classical
mechanics) that measurements of the mass of any object give the same results in all 
inertial frames. Thus the mass m' measured in S' is the same as that measured in 
S, and m' = m. The proof that the same is true for the net force depends, to some 
extent, on one’s definition of force. If we take the view that forces are defined by the 
readings on spring balances, then it is clear that the force F' measured in S' is the 
same as that measured in S, and F' = F. Finally, differentiating (15.3) with respect 
to time (and remembering that V is constant, by assumption) we see that a' = a. We 
have now proved that each of the variables F', m', and a' of frame S' is equal to the 
corresponding variable F, m, and a of frame S. Therefore, if it is true that F = ma, it
is also true that F' = m'a'. That is, Newton’s second law is invariant under the Galilean 
transformation. I leave it as an exercise (Problem 15.1) to prove that the same is true 
of the first and third laws. The invariance of the laws of mechanics under the Galilean 
transformation was known to Galileo, who used it to argue that no experiment could 
tell whether the earth was “really” moving or “really” at rest, and hence that Copernicus’s
sun-centered view of the solar system was just as reasonable as the traditional earth- 
centered view. 
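
The key step in this proof, that a' = a, can be illustrated numerically. The following Python sketch assumes a hypothetical cart with constant acceleration 2 m/s² in S, and a frame S' moving at 30 m/s; both numbers and all function names are illustrative assumptions. It estimates the acceleration in each frame from second differences of the position.

```python
def x_in_S(t):
    """Hypothetical motion in S: x = 5 t + (1/2)(2.0) t^2, in SI units."""
    return 5.0 * t + 0.5 * 2.0 * t**2

def x_in_S_prime(t, V=30.0):
    """The same motion seen from S', using the Galilean transformation x' = x - V t."""
    return x_in_S(t) - V * t

def acceleration(x, t, dt=1e-3):
    """Central-difference estimate of the second derivative of x(t)."""
    return (x(t + dt) - 2.0 * x(t) + x(t - dt)) / dt**2

a = acceleration(x_in_S, 1.0)
a_prime = acceleration(x_in_S_prime, 1.0)
print(a, a_prime)  # both should be close to 2.0 m/s^2
```

Because the term V t is linear in time, it drops out of the second derivative, which is why the two frames agree on the acceleration and hence, with m' = m and F' = F, on the second law.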
Galilean Relativity and the Speed of Light

While Newton’s laws are invariant under the Galilean transformation, the same is not 
true of the laws of electromagnetism. Whether we write them in their compact form as 
Maxwell’s four equations, or in their original form (as Coulomb’s law, Faraday’s law, 
and so on), they can be true in one inertial frame, but if they are, and if the Galilean 
transformation were the correct relation between different inertial frames, then they 
could not be true in any other inertial frame. By far the quickest way to verify this 
claim is to recall that Maxwell’s equations imply that light (and, more generally, any 
electromagnetic wave) propagates through the vacuum in any direction with speed 


$$c = \frac{1}{\sqrt{\epsilon_0\mu_0}} = 3.00 \times 10^8 \text{ m/s}, \qquad (15.4)$$

where $\epsilon_0$ and $\mu_0$ are the permittivity and permeability of the vacuum. Thus if
Maxwell’s equations hold in frame S, then light must travel at the same speed $c$
in any direction, as measured in S. But now consider a second frame S', traveling
at speed $V$ along the $x$ axis of S, and imagine a beam of light traveling in the same
direction. In S the light’s speed is $v = c$. Therefore, in S' its speed is given by the
classical velocity-addition formula (15.3) as

$$v' = c - V,$$


as shown on the left of Figure 15.2. Similarly, a beam of light traveling to the left will 
have speed v = c in S, but v' = c + V in S'. Depending on its direction, any beam of 
light will have speed v' (as measured in S') that varies anywhere between c — V and 
c + V. Therefore, Maxwell’s equations cannot hold in the inertial frame S'. 
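
As a quick numerical illustration of (15.4) and of the classical prediction just described, the Python sketch below computes c from the vacuum constants and then applies the classical velocity-addition formula. The numerical values of the constants are the CODATA 2018 recommended values; the train speed of 50 m/s and the function name `galilean_light_speeds` are illustrative assumptions.

```python
import math

EPSILON_0 = 8.8541878128e-12   # vacuum permittivity in F/m (CODATA 2018 value)
MU_0 = 1.25663706212e-6        # vacuum permeability in N/A^2 (CODATA 2018 value)

c = 1.0 / math.sqrt(EPSILON_0 * MU_0)   # Eq. (15.4)

def galilean_light_speeds(c, V):
    """Classical prediction (Eq. 15.3) for light speeds in S': (rightward beam, leftward beam)."""
    return c - V, c + V

V = 50.0  # hypothetical train speed in m/s
rightward, leftward = galilean_light_speeds(c, V)
print(f"c = {c:.4e} m/s")
print(f"classical prediction in S': {rightward:.1f} m/s (right), {leftward:.1f} m/s (left)")
```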


Figure 15.2 Two frames S and S' in the standard configuration with relative velocity
V. Two beams of light approach the car from opposite directions. If, as measured
in S, the light has speed c in either direction, then the classical velocity-addition
formula implies that, as measured in S', it has speed c − V traveling to the right,
and c + V traveling to the left.


If the Galilean transformation were the correct transformation between inertial 
frames, then although Newton’s laws would hold in all inertial frames, there could 
only be one frame in which Maxwell’s equations hold. This supposed unique frame, 
in which light would travel at the same speed in all directions, is sometimes called the 
ether frame. 4 


4 The origin of the name is this: It was assumed that light must propagate through a medium, in 
much the same way that sound travels through the air. Since no one had ever detected this medium 
The Michelson-Morley Experiment

The state of affairs just described, with the laws of mechanics valid in all inertial
frames, but the laws of electromagnetism valid in a unique frame, was well understood
toward the end of the nineteenth century. It was regarded by some (most notably
Einstein) as unpleasing, and it was eventually shown by Einstein to be wrong. Nevertheless,
it was logically consistent, and most physicists took for granted that there
could be only one frame in which the speed of light had the same value c in all directions.
Since the earth travels at a considerable speed in a continually changing
direction around the sun, it seemed obvious that the earth must spend most of its time
moving relative to the ether frame and hence that the speed of light as measured on
earth should be different in different directions. The effect was expected to be very
small. (The earth’s orbital speed is $V \approx 3 \times 10^4$ m/s, large by terrestrial standards,
but very small compared to $c = 3 \times 10^8$ m/s. Thus the fractional variation, between
$c - V$ and $c + V$, was expected to be very small.) Nevertheless, in 1880, the American
physicist Albert Michelson (1852-1931), later assisted by the chemist Edward
Morley (1838-1923), devised an interferometer that should have easily detected the
expected differences in the speed of light. To their surprise and dismay they found
absolutely no variation.
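
To put a number on "very small," the following Python sketch estimates the earth's orbital speed from the radius and period of its orbit and then computes the fractional spread between c − V and c + V that classical physics predicted. The orbital radius (about 1.496 × 10^11 m) and the length of the year (about 3.156 × 10^7 s) are standard values supplied here as assumptions; they are not quoted in the text.

```python
import math

C = 2.998e8               # speed of light in m/s
ORBIT_RADIUS = 1.496e11   # mean earth-sun distance in m (assumed standard value)
YEAR = 3.156e7            # orbital period in s (assumed standard value)

V = 2.0 * math.pi * ORBIT_RADIUS / YEAR        # orbital speed, treating the orbit as a circle
fractional_variation = ((C + V) - (C - V)) / C  # equals 2V/c

print(f"V = {V:.3e} m/s")
print(f"fractional variation = {fractional_variation:.2e}")
```

The result is a few parts in ten thousand, which is why an interferometer, sensitive to small differences in light travel time, was the natural instrument for the search.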

Their experiments, and many different experiments with the same objective, have 
been repeated and have never found any reproducible evidence of variations in the 
speed of light relative to the earth. With hindsight, it is easy to draw the right 
conclusion: Contrary to all expectations, the speed of light is the same in all directions 
relative to an earth-based frame, even though the earth has different velocities at 
different times of year. In other words, it is not true that there is only one frame in 
which light has the same speed in all directions. 

This conclusion is so surprising that it was not taken seriously for twenty years.
Instead, several ingenious theories were advanced to explain the Michelson-Morley
result while preserving the idea of a unique ether frame. For example, the so-called
ether-drag theory held that the ether (the medium through which light was supposed
to propagate) was dragged along with the earth, in much the same way the
atmosphere is dragged along. This would imply that earth-bound observers are at
rest relative to the ether and should measure the same speed of light in all directions.
However, the ether-drag theory was incompatible with the phenomenon of stellar aberration. 5
None of these alternative theories was able to explain all of the observed facts
(at least, not in a reasonable and economical way), and today almost all physicists
accept that there is no unique ether frame and that the speed of light is a universal
constant, with the same value in all directions in all inertial frames. The first person to


and since light could travel through seemingly empty space, the medium clearly had most unusual 
properties, and was named “ether” after the Greek for the stuff of the heavens. The “ether frame” 
was the frame in which the supposed ether was at rest. 

5 The ether-drag theory would require that light entering the earth’s envelope of ether would be
bent. This would contradict stellar aberration, in which the apparent position of any one star moves
around a small circle as the earth moves around its circular orbit, in a way that makes clear that
the light from the star travels in a straight line as it approaches the earth.
accept this surprising idea whole-heartedly was Einstein, as we now discuss. In particular,
we shall see that the universality of the speed of light requires us to reject the
Galilean transformation and the classical picture of space and time on which it was 
based. This, in turn, will require us to modify much of our Newtonian mechanics. 


15.3 The Postulates of Special Relativity 


The special theory of relativity is based on the acceptance of the universality of the 
speed of light, as suggested by the Michelson-Morley experiment. 6 Einstein proposed 
two postulates, or axioms, expressing his conviction that all the laws of physics should 
hold in all inertial frames, and from these postulates, he developed his special theory 
of relativity. 

Before we discuss the postulates of relativity, it would be good to agree on what 
we mean by an inertial frame: 


Definition of an Inertial Frame

An inertial frame is any reference frame (that is, a system of coordinates x, y, z 
and time t) in which all the laws of physics hold in their usual form. 


Notice that I have not yet specified what “all the laws of physics” are. Following 
Einstein, we shall use the postulates of relativity to help us decide what the laws 
of physics could be. (As always, the ultimate test will be whether they agree with 
experiment.) It will turn out that one of the classical laws that carries over into 
relativity is the law of inertia, Newton’s first law. Thus our newly defined inertial 
frames are in fact the familiar “unaccelerated” frames, where an object subject to no 
forces travels with constant velocity. As before, a frame fixed to the earth is (to a good 
approximation) inertial; a frame fixed to an accelerating rocket or a spinning turntable 
is not. The big difference between the inertial frames of relativity and those of classical 
mechanics is the mathematical relation between different frames. In relativity, we 
shall find that the classical Galilean transformation must be replaced by the so-called 
Lorentz transformation. 

Notice also that I have specified that an inertial frame is one where the physical laws 
hold “in their usual form.” As we saw in Chapter 9, one can sometimes modify physical 
laws so that they hold in noninertial frames as well. (For example, by introducing the 
centrifugal and Coriolis forces, we could use Newton’s second law in a rotating frame.) 
It is to exclude such modifications that I added the qualifier “in their usual form.” 


6 Whether Einstein actually knew about the Michelson-Morley result when he was formulating 
his theory is not clear. There is some evidence that he did, but it seems clear that his main motivation 
was the conviction that Maxwell’s equations should hold in all inertial frames. Whether he knew or 
not affects neither Einstein’s amazing accomplishment nor the importance of the Michelson-Morley 
result as beautifully clear evidence in favor of Einstein’s assumptions. 
The first postulate of relativity asserts the existence of many different inertial
frames, traveling at constant velocity relative to one another: 


First Postulate of Relativity 

If S is an inertial frame and if a second frame S' moves with constant velocity 
relative to S, then S' is also an inertial frame. 


Another way to say this is that the laws of physics are invariant as we transfer our 
attention from one frame to a second one moving at constant velocity relative to the 
first. This is what we proved for the laws of mechanics, but we are now claiming it 
for all the laws of physics. 

Another popular statement of the first postulate is that “there is no such thing as 
absolute motion.” To understand this, consider two frames, S attached to the earth 
and S' attached to a rocket coasting at constant velocity relative to the earth. A natural 
question is whether there is any meaningful sense in which we could say that S is 
really at rest and S' is really moving (or vice versa). If the answer were “yes,” then 
we could say that S is absolutely at rest and that anything moving relative to S is in 
absolute motion. However, this would contradict the first postulate of relativity: All 
of the laws observable by scientists in S are equally observable by scientists in S'; any 
experiment that can be performed in S can equally be performed in S'. Therefore, no 
experiment can show which frame is really moving. Relative to the earth, the rocket 
is moving; relative to the rocket, the earth is moving; and this is all we can say. 

Yet another statement of the first postulate is that among all the inertial frames, 
there is no preferred frame. The laws of physics single out no one frame as being in 
any way more special than any other. 

The second postulate specifies one of the laws that holds in all inertial frames: 


Second Postulate of Relativity 

The speed of light (in vacuum) has the same value c in every direction in all 
inertial frames. 


This is, of course, the Michelson-Morley result. 

Although the second postulate flies in the face of our everyday experience, it is by
now a firmly established experimental fact. As we explore the consequences of Einstein’s
postulates we are going to encounter several surprising predictions, all of which
seem to contradict our experience (for example, the phenomenon called time dilation,
described in the next section). If you have difficulty accepting these predictions, there
are two points to bear in mind: First, they are all logical consequences of the second
postulate. Thus, once you have accepted the latter (surprising, but indisputably
true), you have to accept all of its logical consequences, however counterintuitive they
may seem. Second, all of these surprising phenomena (including the second postulate
itself) have the subtle property that they become important only when objects
travel with speeds comparable to the speed of light. In everyday life, with all speeds
much less than c, these phenomena simply do not show up. In this sense, none of
the surprising consequences of Einstein’s postulates really conflict with our everyday
experience.


15.4 The Relativity of Time; Time Dilation

Measurement of Time in a Single Frame 

We are going to find that the second postulate forces us to abandon the classical notion 
of a single universal time. Instead, we shall find that the time of any one event, as 
measured in two different inertial frames, is in general different. This being the case, 
we need first to be quite clear what we mean by time, as measured in a single frame. 

I shall take for granted that we have at our disposal lots of reliable tape measures 
and clocks. The clocks need not be identical, but they must have the property that, 
when brought together at the same point, at rest in the same inertial frame, they agree 
with one another. Let us now consider a single inertial frame S, with origin O. We can 
station a chief observer at O with one of our clocks, and she can easily time any nearby 
event, such as a small explosion, since she will see it essentially instantaneously. To 
time an event farther away from the origin is harder, since light from the event has to 
travel to O before she can sense it. If she knew how far away the event occurred, then 
she could calculate how long the signal took to reach her (she knows that light travels 
at speed c) and subtract this from the time of arrival to give the time of the event. A
simpler way to proceed (in principle anyway) is to employ a large number of helpers 
stationed at regular intervals throughout the region of interest and each with his own 
clock. The helpers can measure their distances from O, and we can check that their 
clocks are synchronized with the clock at O by having the chief observer send out a 
light signal at an agreed time (on her clock). Each helper can calculate the time taken 
by the signal to reach him and (allowing for this transit time) check that his clock 
agrees with the clock at O. 
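
The bookkeeping described in this paragraph is simple enough to sketch in a few lines of Python. The function names, the tolerance, and the distances used below are illustrative assumptions; the only physics used is that light travels at speed c within the single frame S.

```python
C = 299_792_458.0  # speed of light in m/s

def event_time(arrival_time, distance):
    """Time of a distant event, inferred by the chief observer at O from the light's arrival time."""
    return arrival_time - distance / C

def helper_clock_is_synchronized(reading_on_arrival, send_time, distance, tol=1e-9):
    """True if a helper's clock reads send_time + distance/C when the sync signal reaches him."""
    return abs(reading_on_arrival - (send_time + distance / C)) < tol

# Light from an event two light-seconds away reaches O at t = 5 s, so the event occurred at t = 3 s.
print(event_time(arrival_time=5.0, distance=2 * C))  # 3.0

# A helper one light-second from O should read 1 s when a signal sent at t = 0 arrives.
print(helper_clock_is_synchronized(reading_on_arrival=1.0, send_time=0.0, distance=C))  # True
```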

With enough helpers, stationed closely enough together, there will be a helper 
close enough to any event to time it essentially instantaneously. Once he has timed 
it, he can, at his leisure, inform everyone else of the result by any convenient means 
(such as a telephone). In this way, any event can be assigned a unique and well-defined 
time t as measured in the frame S. In what follows, I shall assume that any inertial
frame S comes with a set of rectangular axes Oxyz and a team of helpers stationed at
rest throughout S and equipped with synchronized clocks. This allows us to assign a
position (x, y, z) and a time t to any event, as observed in the frame S.

Time Dilation 

Let us now compare measurements of times made by observers in two different inertial 
frames. Consider our familiar two frames, S anchored to the ground and S' traveling
with a train in the x direction at speed V relative to S. We now examine a thought
experiment (or gedanken experiment, from the German) in which an observer on 
the train sets off a flashbulb on the floor of the train. The light travels to the roof, where 
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Figure 15.3 (a) The thought experiment as seen in frame S'. The light travels 

straight up and down again, and the flash and beep occur at the same place, (b) As 
seen in S, the flash and beep are separated by a distance V At. Notice that in S two 
observers are needed to time the two events, since they occur in different places. 


it is reflected back and returns to its starting point, where it strikes a photocell and 
causes an audible “beep.” We wish to compare the times. At and At', as measured 
in the two frames, between the flash as the light leaves the floor and the beep as it 
returns. 

As seen in the frame S', our experiment is shown in Figure 15.3(a). If the height of 
the train is h, then, as seen in S', the light travels a total distance 2h at speed c (second 
postulate) and so takes a time 

$$\Delta t' = \frac{2h}{c}. \tag{15.5}$$


This is the time between the flash and the beep, as measured by an observer in S' 
(provided, of course, his clock is reliable). 

As seen in S, our experiment is shown in Figure 15.3(b). In particular, the same 
beam of light is seen to travel along the two sides AB and BC of a triangle. If $\Delta t$ is the 
time between the flash and the beep (as measured in S), the side AC has length $V\,\Delta t$. 
Thus the right triangle ABD has sides h,⁷ $V\,\Delta t/2$, and $c\,\Delta t/2$. (Notice that this is where 
we use the second postulate, that the speed of light is c in either frame.) Therefore, 

$$(c\,\Delta t/2)^2 = h^2 + (V\,\Delta t/2)^2,$$

which we can solve to give 

$$\Delta t = \frac{2h}{\sqrt{c^2 - V^2}} = \frac{2h}{c}\,\frac{1}{\sqrt{1 - \beta^2}}, \tag{15.6}$$

where I have introduced the useful abbreviation 

$$\beta = \frac{V}{c}, \tag{15.7}$$

which is just the speed V measured in units of c. 
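
For readers who want the intermediate algebra, here is a sketch of the steps from the triangle relation to (15.6), using only the quantities already introduced (h, V, c, and the time between flash and beep measured in S):

$$
\begin{aligned}
\left(\frac{c\,\Delta t}{2}\right)^2 &= h^2 + \left(\frac{V\,\Delta t}{2}\right)^2 \\
(c^2 - V^2)\,\frac{\Delta t^2}{4} &= h^2 \\
\Delta t &= \frac{2h}{\sqrt{c^2 - V^2}} = \frac{2h}{c}\,\frac{1}{\sqrt{1 - V^2/c^2}} .
\end{aligned}
$$

Only the positive root is kept, since the quantity is an elapsed time. Dividing the last line by (15.5) shows that the two measured times differ by the factor $1/\sqrt{1 - V^2/c^2}$.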


7 I take for granted that the height of the train is the same in either frame. We’ll prove this shortly. 

The striking thing about the two results (15.5) and (15.6) is that they are not equal. 
The time between the same two events (the flash and the beep) has different values as 
measured in the two different inertial frames. Specifically, 

$$\Delta t = \frac{\Delta t'}{\sqrt{1 - \beta^2}}. \tag{15.8}$$


We derived this result for a thought experiment with a flash of light reflected back to its 
source by a mirror on the ceiling of the railroad car, but the conclusion applies to any 
two events that occur at the same place in the train. Suppose, for instance, an observer 
at rest in S' were to shout “Good” and a moment later “Grief.” In principle, we could 
ignite a flashbulb at the “good,” and arrange a mirror which would reflect the light 
back to arrive at the moment of “grief.” Therefore, the relation (15.8) must apply to 
these two events, the “good” and the “grief.” Since the timing of the two events cannot 
depend on whether we actually did the experiment with the light and the beeper, we 
conclude that the relation (15.8) must apply to any two events that occur at the same 
place in the frame S'. 

You should avoid thinking that the clocks in one of our frames are somehow 
running incorrectly — on the contrary, it was essential to our argument that all clocks, 
in both frames, were running correctly. Further, it makes no difference what particular 
kinds of clock we used, so the conclusion (15.8) applies to all (accurate) clocks. That 
is, time itself, as measured in the two frames, is different in accordance with (15.8). 
As we shall discuss shortly, this surprising conclusion has been verified repeatedly. 

If the frame S' is actually at rest (relative to S), then V = 0, so $\beta = 0$, and (15.8) 
reduces to $\Delta t = \Delta t'$. That is, there is no difference in the times unless S' is actually 
moving relative to S. Moreover, at normal terrestrial speeds, $V \ll c$, so $\beta \ll 1$ and 
the denominator in (15.8) is very close to one. That is, at the speeds of our everyday 
experience, the two times are very nearly equal — so close that it would be almost 
impossible to detect any difference, as the following example shows. 


example 15.1 Time Differences for a Jet Plane 

Suppose that the pilot of a jet traveling at a steady V = 300 m/s arranges to set 
off a flashbulb at intervals of exactly one hour (as measured in his reference 
frame). If we arrange two observers on the ground to check this, what would 
they measure for the time At between two successive flashes? (Take the ground 
to be an inertial frame; that is, ignore effects of the earth’s rotation.) 

The required interval is given by (15.8) with $\Delta t' = 1$ hour and $\beta = V/c = 10^{-6}$. So 

$$
\begin{aligned}
\Delta t &= \frac{\Delta t'}{\sqrt{1 - \beta^2}} = \frac{1\ \text{h}}{\sqrt{1 - 10^{-12}}} \\
&\approx 1\ \text{h} \times \left(1 + \tfrac{1}{2} \times 10^{-12}\right) = 1\ \text{h} + 1.8 \times 10^{-9}\ \text{s}
\end{aligned}
$$

where in going to the second line, I have used the binomial approximation.⁸ In 
this experiment, the time difference is less than 2 nanoseconds (1 ns = $10^{-9}$ s). It 
is not hard to see why classical physicists had failed to detect such differences! 
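
The size of this effect can be checked numerically. The sketch below compares the binomial estimate with a high-precision evaluation of (15.8) for the jet's numbers; the function names are my own, and the choice of Python's `decimal` module (with 40 digits) is just one convenient way to avoid round-off when subtracting two nearly equal numbers.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40  # far more digits than needed to resolve 1 - 1e-12


def dilation_excess_exact(beta, proper_interval_s):
    """(gamma - 1) * proper interval, evaluated in high-precision arithmetic."""
    b = Decimal(beta)
    gamma = 1 / (1 - b * b).sqrt()
    return (gamma - 1) * Decimal(proper_interval_s)


def dilation_excess_binomial(beta, proper_interval_s):
    """Leading binomial term: gamma - 1 is about beta**2 / 2 when beta << 1."""
    return 0.5 * beta**2 * proper_interval_s


# Jet of Example 15.1: beta = 1e-6, flashes one hour (3600 s) apart in the jet frame.
exact = dilation_excess_exact("1e-6", 3600)
approx = dilation_excess_binomial(1e-6, 3600.0)
print(f"exact    : {float(exact):.6e} s")
print(f"binomial : {approx:.6e} s")
```

Both lines agree with the 1.8 ns quoted above; the next term in the binomial series is smaller by a further factor of order $\beta^2$.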

As we increase V, the difference between the times in (15.8) gets bigger, and if we 
let V approach c, we can make the difference as large as we please. For example, if 
V = 0.99c, then $\beta = 0.99$ and (15.8) gives $\Delta t \approx 7\,\Delta t'$. Speeds this high are routinely 
achieved by the accelerators at particle-physics labs, and the predicted time difference 
is precisely confirmed. 

If we put V = c (that is, $\beta = 1$) in (15.8), we would get the absurd result $\Delta t = \Delta t'/0$, 
and if we put V > c (that is, $\beta > 1$), we would get an imaginary value for $\Delta t$. 
These results suggest that V must always be less than c, 

V < c, 

a suggestion that proves correct and is one of the most profound results of relativity: 
The relative speed of two inertial frames can never equal or exceed c. That is, the 
speed of light, in addition to being the same in all inertial frames, is also the universal 
speed limit for the relative motion of any two inertial frames. 

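The factor $1/\sqrt{1 - \beta^2}$ occurs so often in relativity that it is usually given its own 
name, $\gamma$: 

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}} = \frac{1}{\sqrt{1 - V^2/c^2}}. \tag{15.9}$$

It is useful to remember that this new factor always satisfies $\gamma \ge 1$, and as $\beta \to 1$ (that 
is, $V \to c$) $\gamma \to \infty$. 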
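
To get a feel for how this factor behaves, it helps to tabulate it for a few speeds. A minimal sketch follows; the function name `lorentz_gamma` and the sample speeds are illustrative choices, not part of the text.

```python
import math


def lorentz_gamma(beta):
    """Return 1/sqrt(1 - beta**2) for a relative speed beta = V/c with 0 <= beta < 1."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


for beta in (0.0, 0.6, 0.8, 0.99):
    print(f"beta = {beta:4.2f}   gamma = {lorentz_gamma(beta):.3f}")
```

The factor stays close to 1 until the speed is a sizable fraction of c and then grows without bound as the speed approaches c.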

In terms of the parameter $\gamma$ the result (15.8) can be written a little more compactly as 

$$\Delta t = \gamma\,\Delta t' \ge \Delta t'. \tag{15.10}$$

The asymmetry of this result (that $\Delta t'$ is never more than $\Delta t$) seems at first glance to 
violate the postulates of relativity, since it suggests a special role for the frame S', 
namely, that S' is the special frame in which the time interval is minimum. However, 
this is just as it should be, since in our thought experiment S' is special, because 
it is the frame where the two events in question (the flash and the beep) occur at the 
same place. (This asymmetry was implicit in Figure 15.3, which showed one observer 
measuring $\Delta t'$, but two measuring $\Delta t$.) To emphasize this asymmetry, the time $\Delta t'$ is 
often renamed $\Delta t_0$ and (15.10) rewritten as 

$$\Delta t = \gamma\,\Delta t_0 \ge \Delta t_0. \tag{15.11}$$


8 This is a nice example of a calculation where one almost has to use the binomial approximation, 
since most calculators cannot tell the difference between 1 and $1 - 10^{-12}$. 
The subscript on $\Delta t_0$ is to emphasize that $\Delta t_0$ is the time elapsed on a clock at rest in 
the special frame where the two events in question occurred at the same place. This 
time is often called the proper time between the two events. In (15.11), $\Delta t$ is the 
corresponding time measured in any frame and is always greater than or equal to the 
proper time $\Delta t_0$. For this reason, the effect implied by (15.11) is called time dilation 
and can be loosely stated by saying that a moving clock is observed to run slow. As 
measured by observers on the ground, a clock in the moving train is found to run slow. 

Finally, I should emphasize the fundamental symmetry between any two inertial 
frames. We chose to do our thought experiment in a way that gave the frame S' a 
special role. (It was the frame in which the flash and beep occurred at the same place.) 
But we could have done the experiment the other way round, with the flashbulb, mirror 
and beeper at rest on the ground, and in this case, we would have found the opposite 
effect, that $\Delta t' = \gamma\,\Delta t$. The advantage of writing the time-dilation formula in the form 
(15.11) is that it avoids the problem of remembering which is frame S, and which S'; 
the subscript on $\Delta t_0$ always flags the proper time, that is, the time measured in the frame 
in which the two events were at the same place. 


Evidence for Time Dilation 

Time dilation was predicted in 1905 but was not experimentally verified until 1941, 
by B. Rossi and D. B. Hall.⁹ The problem was, of course, to get a clock traveling 
sufficiently fast to show a measurable dilation. Rossi and Hall exploited the natural 
clocks that come with unstable subatomic particles, which decay (on average) after 
a definite time, characteristic of the particle. The lifetime of an unstable particle can 
be specified by its half-life, $t_{1/2}$, the time in which half of a large number of the 
particles will decay. The muon is an unstable particle that is created in the earth’s upper 
atmosphere when cosmic ray particles (mostly protons and alpha particles) from outer 
space collide with atmospheric atoms. Many of these muons have speeds quite close 
to the speed of light, and they live long enough to find their way down to the earth’s 
surface. The muon had been discovered in 1935 by Carl Anderson in his studies of 
cosmic rays. By 1941 its half-life was known to be about $t_{1/2} = 1.5\ \mu\text{s}$, meaning that 
half of a sample of muons at rest would decay in this time. If time dilation is correct, 
the half-life for a moving muon (as measured by earth-bound observers) should be 
larger by the factor $\gamma$ as in (15.11). For example, if the muon had speed 0.8c, then 
$\gamma = 1.67$, and the muon’s half-life should be 

$$t_{1/2}(\text{at speed } 0.8c) = 1.67 \times t_{1/2}(\text{at rest}) = 2.5\ \mu\text{s}.$$

Rossi and Hall were able to separate out cosmic-ray muons according to their speed 
and they could find their half-lives by measuring how many of them survived the 
journey through the atmosphere. Although their measurements had quite large 
experimental errors, they were nonetheless good enough to verify Einstein’s prediction 
(15.11) and to exclude the classical assumption of a single universal time. 
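
The logic of such a measurement can be sketched in a few lines. The code below uses the half-life (1.5 microseconds) and speed (0.8c) quoted above; the 2000 m path length is a made-up illustrative number, not a value from the Rossi-Hall experiment, and the function names are my own.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def dilated_half_life(rest_half_life_s, beta):
    """Half-life seen by earth-bound observers: gamma times the rest value, as in (15.11)."""
    return rest_half_life_s / math.sqrt(1.0 - beta * beta)


def surviving_fraction(distance_m, beta, half_life_s):
    """Fraction of particles left after covering distance_m at speed beta*c,
    given the half-life that applies in the frame where the distance is measured."""
    travel_time_s = distance_m / (beta * C)
    return 0.5 ** (travel_time_s / half_life_s)


REST_HALF_LIFE_S = 1.5e-6   # value quoted in the text
BETA = 0.8                  # speed used in the text's example
PATH_M = 2000.0             # hypothetical path length, for illustration only

with_dilation = surviving_fraction(PATH_M, BETA, dilated_half_life(REST_HALF_LIFE_S, BETA))
without_dilation = surviving_fraction(PATH_M, BETA, REST_HALF_LIFE_S)
print(f"dilated half-life           : {dilated_half_life(REST_HALF_LIFE_S, BETA):.2e} s")
print(f"surviving fraction (dilated): {with_dilation:.3f}")
print(f"surviving fraction (classic): {without_dilation:.3f}")
```

Counting survivors therefore distinguishes between the two predictions: more muons reach the end of the path than a single universal time would allow.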


9 B. Rossi and D. B. Hall, Physical Review, vol. 59, p. 223 (1941). 
A test of time dilation using man-made clocks had to await the development of 
superaccurate atomic clocks. In 1971 four portable atomic clocks were synchronized 
with a reference clock at the U. S. Naval Observatory in Washington DC and then flown 
around the world in a jet plane and returned to the Naval Observatory. The observed 
discrepancy between the reference clock and the portable clocks was (273 ± 7) ns 
(averaged over the four clocks) in excellent agreement with the predicted value 
(275 ± 21) ns.¹⁰ 

Tests of time dilation — using both the natural clocks of unstable particles and 
man-made atomic clocks — have been repeated with ever-increasing precision, and 
there is now no doubt that the relativity of time, as embodied in (15.11), is true. 
Another important test that is carried out thousands of times every day is the Global 
Positioning System (GPS). This system, which is used by airplanes, ships, cars, and 
hikers to find their positions within a few meters, times the arrival of signals from 
24 GPS overhead satellites at the observer’s receiver and calculates the receiver’s 
position from the known positions of the satellites. To find the position within a few 
meters requires an accuracy of a few nanoseconds, which requires that allowances 
be made for the relativistic differences between the times of the satellite and earth- 
bound reference frames. The success of the GPS is a daily tribute to the correctness 
of relativity.¹¹ 


15.5 Length Contraction 


The postulates of relativity have forced us to the conclusion that time is relative 
(the time between two given events is different when measured in different inertial 
frames) and, even more important, this conclusion is borne out by experiment. This, 
in turn, implies that the length of an object is likewise dependent on the frame in which 
it is measured. To see this, we’ll conduct a second thought experiment with the train 
of Figure 15.3, this time measuring its length. For an observer (let’s call him Q) on 
the ground (frame S) the simplest procedure is probably to measure the time $\Delta t$ for 
the train to pass him and calculate the length as¹² 

$$l = V\,\Delta t. \tag{15.12}$$


10 See J. C. Hafele and R. E. Keating, Science, vol. 177, p. 166 (1972). Two trips were made, one 
going west and the other going east, both with satisfactory results. The numbers quoted here are for 
the more decisive westward trip. This experiment was actually a test of general, as well as special, 
relativity, since the predicted discrepancy has an appreciable contribution from gravitational effects. 

11 For a readable account of the large role of relativity in the GPS, see N. Ashby, Physics Today, 
May 2002, p. 41. As described there, there are important contributions from general, as well as 
special, relativity. Thus, the success of the GPS is a test of both theories. 

12 With so many of the familiar classical ideas being questioned, you are entitled to ask if it 
is legitimate to use the classical formula (15.12). However, this is just the definition of velocity 
(velocity = distance/time), and is certainly valid in any one reference frame (as long as we measure 
all quantities in this same frame). 
To find the length $l'$ of the train as measured in the train’s rest frame, an observer 
on the train could simply use a long tape measure. However, for comparison with 
(15.12), it is convenient to use a different method. We can station two observers on 
the train, one at the front and another at the back, and have them record the times at 
which they pass the observer Q on the ground. The difference $\Delta t'$ between these two 
times is the time (as measured in frame S') for the train to pass observer Q, so the 
length of the train (again as measured in S') is just 

$$l' = V\,\Delta t'. \tag{15.13}$$

Notice that we are making an important assumption here, that the speed of frame S 
relative to S' is the same as the speed V of S' relative to S. (The relative velocities 
are in opposite directions, but their magnitudes are the same.) This is true in classical 
mechanics, and it is also true in relativity, where it follows from the two postulates. 
The details of the argument require some care, but the gist is this: Consider the 
transformation from frame S to S'. We’ll denote it by (S → S') temporarily. Suppose 
that, before making this transformation, we were to rotate our axes through 180° about 
the y (or z) axis, then make the transformation, and then rotate back again. The effect 
of the rotations is to reverse the direction of the x axis (and finally rotate it back again). 
The net effect of all three operations is precisely the transformation (S' → S). Since 
the rotations certainly don’t change any speeds, we’ve proved that the speed of S' 
relative to S is the same as that of S relative to S'. 

Comparing (15.12) with (15.13), we see that, since the times $\Delta t$ and $\Delta t'$ are 
unequal, the same has to be true of the lengths $l$ and $l'$. To quantify the difference, 
we must be careful to get the relation between $\Delta t$ and $\Delta t'$ the right way around. 
These two times are the times (as measured in S and S') between two events: “front 
of train opposite observer Q” and “back of train opposite observer Q.” These two 
events occur at the same place in frame S, so $\Delta t$ is the proper time, and $\Delta t' = \gamma\,\Delta t$. 
Inserting this into (15.13) and comparing with (15.12), we see that $l' = \gamma l$ or 

$$l = \frac{l'}{\gamma}. \tag{15.14}$$

The length of the train as measured in S is less than that measured in S' (unless V = 0). 
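
The chain of reasoning from the two timing measurements to the contracted length can be traced numerically. In the sketch below the train's rest length (100 m) and speed (0.6c) are hypothetical values chosen for round numbers, and the function name is my own.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def train_measurements(rest_length_m, beta):
    """Follow the timing argument: return (dt_prime, dt, length_in_S)."""
    V = beta * C
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    dt_prime = rest_length_m / V   # (15.13): time for the train to pass Q, measured in S'
    dt = dt_prime / gamma          # both events occur at Q, so dt is the proper time
    length_in_S = V * dt           # (15.12): length deduced by Q on the ground
    return dt_prime, dt, length_in_S


dt_prime, dt, length_in_S = train_measurements(rest_length_m=100.0, beta=0.6)
print(f"time to pass Q, measured in S': {dt_prime:.3e} s")
print(f"time to pass Q, measured in S : {dt:.3e} s")
print(f"length measured in S          : {length_in_S:.1f} m")
```

With these numbers the ground observer finds a length of 80 m for a train whose rest length is 100 m.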

Like time dilation, the effect (15.14) is asymmetric, reflecting the asymmetry of 
the experiment. The frame S' is special, since it is the unique frame where the object 
being measured (the train) is at rest. [We could, of course, have done the experiment 
the other way round. If we had measured the length of a building that is at rest on the 
ground, then the roles of $l$ and $l'$ would have been reversed.] To avoid confusion as to 
which frame is which, it is common to rewrite (15.14) as 

$$l = \frac{l_0}{\gamma} \le l_0, \tag{15.15}$$

where $l_0$ denotes the length of an object measured in the object’s rest frame (the 
frame in which the object is at rest), while $l$ is the length in any frame. The length $l_0$ 
is called the object’s proper length. Since $l < l_0$ (if $V \ne 0$), this difference in lengths 
is called the length contraction (or the Lorentz contraction, or Lorentz-Fitzgerald 
contraction, after the two physicists, the Dutch Hendrik Lorentz, 1853-1928, and 
the Irish George Fitzgerald, 1851-1901, who first suggested there must be some 
such effect). The result can be loosely paraphrased by saying that a moving body is 
observed to be contracted. 

Like time dilation, length contraction is a real effect, well established by experiment. 
Since the two effects are so intimately connected, any evidence for one can 
be taken as evidence for the other. In particular, the decay of a high-speed unstable 
particle, when viewed in the particle’s rest frame, can be interpreted as clear evidence 
for length contraction. (See Problem 15.12.) 


Lengths Perpendicular to the Relative Velocity 

The length contraction just derived applies to lengths in the direction of the relative 
velocity, such as the length of a train in the direction of motion. It is easy to see that 
there can be no analogous contraction or expansion of lengths perpendicular to the 
motion, such as the height of the train. Suppose for example there were a contraction 
and imagine two observers Q standing at rest in S and Q' in S'. Suppose further that 
Q and Q' are equally tall (when at rest) and that Q' is holding a knife exactly level 
with the top of his head. If there is a contraction, then as measured by Q, observer 
Q' will be shortened as he rushes past, and Q will be scalped, or worse, as the knife 
goes by. But, unlike our previous thought experiments, this experiment is completely 
symmetric between the two frames: There is just one observer in each frame, and the 
only difference is the direction of the relative velocities. Therefore, it must also be 
that, as seen by Q', it is Q who is contracted; so the knife misses Q, and Q is not 
scalped. The assumption of a contraction has led us to a contradiction and there can 
be no contraction. A similar argument excludes the possibility of expansion, and, in 
fact, the knife just scrapes past Q as seen in either frame. We conclude that lengths 
perpendicular to the relative motion are unchanged. The length-contraction formula 
(15.15) applies only to lengths parallel to the relative velocity. 


15.6 The Lorentz Transformation 


According to the classical notions of space and time, we saw that the mathematical 
relation between coordinates in two inertial frames S and S' is the Galilean transformation 
(15.1). In relativity, this cannot be the correct relation. (For example, time 
dilation contradicts the equation t = t'.) However, we can deduce the correct relation 
using an argument similar to the one that we used in connection with Figure 15.1 to 
derive the Galilean result. We imagine two frames, S attached to the ground and S' 
attached to a train moving with speed V relative to S. We imagine, further, the explosion 
of a firecracker, which leaves a burn mark on the wall of the railroad car at a point 
P'. The coordinates of this explosion are (x, y, z, t) as measured by observers in S 
and (x', y', z', t') in S'. Our object is to find formulas for x', y', z', and t' in terms of 
x, y, z, and t. The thought experiment is illustrated in Figure 15.4, which is just like 
Figure 15.1 except that we now know we must be very careful to identify the frames 
(S or S') relative to which the various distances are measured. 

Figure 15.4 The coordinate x' is the horizontal distance, measured in S', between 
the origin O' and the burn mark at P'. The distances x and Vt are both measured 
in S at the time t (measured in S) of the explosion. 

Since lengths perpendicular to the relative velocity are the same in both frames, 
we can immediately write 

$$y' = y \quad\text{and}\quad z' = z \tag{15.16}$$

exactly as with the Galilean transformation. The coordinate x' is the horizontal distance 
between the origin O' and the burn mark at P', as measured in S'. The same 
distance as measured in S is $x - Vt$, since x and Vt are the distances from O to P' 
and from O to O' at the instant t of the explosion (measured in S). Therefore, by the 
length-contraction formula (15.15) (x' is the proper length here) 

$$x - Vt = x'/\gamma$$

or 

$$x' = \gamma(x - Vt). \tag{15.17}$$

This is the third of the four equations that we need. Notice that if $V \ll c$ then $\gamma \approx 1$ 
and (15.17) reduces to the Galilean relation $x' = x - Vt$. 

Finally, to get an equation for t' we can use a simple trick. We could repeat the 
previous argument with the roles of S and S' exchanged. That is, we could let the 
explosion burn a mark at a point P on a wall fixed in S. Arguing as before, we would 
get the result 

$$x = \gamma(x' + Vt'). \tag{15.18}$$

(Notice that we could get this result directly from (15.17) by exchanging the primed 
and unprimed variables and replacing V by $-V$.) Substituting (15.17) into (15.18), 
we can eliminate x' and solve for t', to give (as you should check) 

$$t' = \gamma(t - Vx/c^2). \tag{15.19}$$

This is the required equation for t'. When $V \ll c$, we can neglect the second term and 
$\gamma \approx 1$, so (15.19) reduces to the Galilean relation t' = t. 
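
For readers who would like to see the check spelled out, here is one way the elimination of x' can go. It uses only (15.17), (15.18), and the identity $\gamma^2 - 1 = \gamma^2 V^2/c^2$, which follows directly from the definition (15.9):

$$
\begin{aligned}
x &= \gamma\left[\gamma(x - Vt) + Vt'\right] \\
\gamma V t' &= x - \gamma^2 x + \gamma^2 V t = \gamma^2 V t - (\gamma^2 - 1)\,x \\
t' &= \gamma t - \frac{\gamma^2 - 1}{\gamma V}\,x = \gamma t - \frac{\gamma V}{c^2}\,x = \gamma\left(t - \frac{Vx}{c^2}\right).
\end{aligned}
$$

The last line is (15.19).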
Collecting together the results (15.16), (15.17), and (15.19), we get the required 
four equations: 

The Lorentz Transformation 

$$
\begin{aligned}
x' &= \gamma(x - Vt) \\
y' &= y \\
z' &= z \\
t' &= \gamma(t - Vx/c^2).
\end{aligned}
\tag{15.20}
$$

These four equations are called the Lorentz transformation or the Lorentz-Einstein 
transformation, in honor of Lorentz, who first proposed them, and Einstein, who 
first interpreted them correctly. The Lorentz transformation gives the coordinates 
(x', y', z', t') of an event, as measured in S', in terms of its coordinates (x, y, z, t) 
as measured in S. It is the correct relativistic version of the classical Galilean 
transformation (15.1). 

If we wanted to know the coordinates (x, y, z, t) in terms of (x', y', z', t'), we 
could solve the four equations (15.20), but a simpler way is just to exchange primed 
and unprimed variables and replace V by $-V$. Either way, the result is the inverse 
Lorentz transformation 

$$
\begin{aligned}
x &= \gamma(x' + Vt') \\
y &= y' \\
z &= z' \\
t &= \gamma(t' + Vx'/c^2).
\end{aligned}
\tag{15.21}
$$

The Lorentz transformation expresses all of the properties of space and time that 
follow from the postulates of relativity. Using it, one can calculate all of the kinematic 
relations between measurements made in different inertial frames. There are several 
examples of its use in the problems at the end of this chapter and here are a couple 
more. 
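
The transformation and its inverse are easy to put into code, which gives a quick numerical check that replacing V by its negative really does undo the transformation. This is a minimal sketch for the standard configuration; the function names and the sample event are my own illustrative choices.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz(x, y, z, t, V):
    """Coordinates (x', y', z', t') in S' of an event (x, y, z, t) in S, as in (15.20)."""
    gamma = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    return gamma * (x - V * t), y, z, gamma * (t - V * x / C**2)


def inverse_lorentz(xp, yp, zp, tp, V):
    """Inverse transformation (15.21): the same formulas with V replaced by -V."""
    return lorentz(xp, yp, zp, tp, -V)


# Hypothetical event in S and a relative speed of 0.6c.
event = (1000.0, 2.0, 3.0, 1.0e-5)
V = 0.6 * C

primed = lorentz(*event, V)
recovered = inverse_lorentz(*primed, V)
print("event in S     :", event)
print("event in S'    :", primed)
print("transformed back:", recovered)
```

Up to floating-point round-off, the recovered coordinates coincide with the original ones.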


example 15.2 Rederiving Length Contraction 

Use the Lorentz transformation to rederive the length contraction formula 
(15.15). (Note that this will not give an alternative derivation of length contraction, 
since length contraction was used in deriving the Lorentz transformation. 
Rather we shall just get a consistency check.) 

Consider our usual two frames, S fixed to the ground and S' fixed to a train 
traveling along the x axis with speed V relative to S. We wish to compare the 
lengths of the train as measured in S and S'. The measurement in S' is easy, since 
the train is at rest in this frame. An observer can, at his leisure, measure the x' 
coordinates $x'_1$ and $x'_2$ of the back and front of the train, and its length is just the 
difference $l' = x'_2 - x'_1$. This length is the proper length of the train, so 

$$l_0 = l' = x'_2 - x'_1. \tag{15.22}$$

The measurement in S is harder since the train is moving. We could, with enough 
care, station two observers $Q_1$ and $Q_2$ beside the track so that the back of the 
train passes $Q_1$ at the exact same instant ($t_1 = t_2$) that the front passes $Q_2$. The 
length as measured in S is then just 

$$l = x_2 - x_1.$$

Now, applying the Lorentz transformation (15.20) to the event “front of train 
passes $Q_2$” we get 

$$x'_2 = \gamma(x_2 - Vt_2)$$

and, for the event “back of train passes $Q_1$,” 

$$x'_1 = \gamma(x_1 - Vt_1).$$

Subtracting and remembering that $t_2 = t_1$, we find 

$$l_0 = x'_2 - x'_1 = \gamma(x_2 - x_1) = \gamma l$$

or $l = l_0/\gamma$, which is the length contraction (15.15). 

Our next example is one of the many seeming paradoxes of relativity. 

example 15.3 A Relativistic Snake 

A relativistic snake, of proper length 100 cm, is traveling across a table at 
V = 0.6c. To tease the snake, a physics student holds two cleavers 100 cm apart 
and plans to bounce them simultaneously on the table so that the left one lands 
just behind the snake’s tail. The student reasons as follows: “The snake is moving 
with $\beta = 0.6$, so its length is contracted by the factor $\gamma = 5/4$ (check this) and its 
length measured in my frame is 80 cm. Therefore, the cleaver in my right hand 
bounces well ahead of the snake, which is unhurt.” This scenario is shown in 
Figure 15.5. Meanwhile the snake reasons thus: “The cleavers are approaching 
me at $\beta = 0.6$, so the distance between them is contracted to 80 cm, and I shall 
certainly be cut to pieces when they fall.” Use the Lorentz transformation to 
resolve this paradox. 

Let us choose frames S and S' in the usual way. The student is at rest in S, 
with the cleavers at $x_L = 0$ and $x_R = 100$ cm. The snake is at rest in S', with 
its tail at x' = 0 and its head at x' = 100 cm. To resolve the dispute, we must find 
where and when the two cleavers fall, as observed in S and in S'. 

In S the cleavers fall simultaneously at t = 0. At this time the snake’s tail is 
at x = 0. Since his length is 80 cm, his head has to be at x = 80 cm. [You can 
check this, if you want, using the transformation equation $x' = \gamma(x - Vt)$; with 
x = 80 cm and t = 0, this gives the correct value x' = 100 cm.] As observed in 
S, the experiment is as shown in Figure 15.5. The right cleaver falls comfortably 
ahead of the snake, the student is right, and the snake is unharmed. 

Figure 15.5 The snake paradox, as seen in the student’s frame S. The 
cleavers fall simultaneously at time t = 0. 

What is wrong with the snake’s reasoning? To answer this, we must examine 
the coordinates and times at which the two cleavers bounce, as observed in S'. 
The left cleaver falls at $t_L = 0$ and $x_L = 0$. According to the Lorentz transformation 
(15.20), the coordinates of this event, as seen in S' are 

$$t'_L = \gamma(t_L - Vx_L/c^2) = 0$$

and 

$$x'_L = \gamma(x_L - Vt_L) = 0.$$

As expected, the left cleaver falls just behind the snake’s tail, at time $t'_L = 0$, as 
shown in Figure 15.6(a). 

So far there are no surprises. However, the right cleaver falls at $t_R = 0$ and 
$x_R = 100$ cm. Therefore, as seen in S', it falls at a time given by the Lorentz 
transformation as 

$$t'_R = \gamma(t_R - Vx_R/c^2) = -2.5\ \text{ns}.$$

(Check the numbers yourself.) The crucial point is that, as seen in S', the two 
cleavers do not fall at the same time. Since the right cleaver falls before the left 
one, it does not necessarily hit the snake, even though they are only 80 cm apart 
(in this frame). In fact, the position at which the right cleaver falls is given by 
the Lorentz transformation as 

$$x'_R = \gamma(x_R - Vt_R) = 125\ \text{cm}.$$

The right cleaver does indeed miss the snake! 

Figure 15.6 The snake paradox, as measured in the snake’s frame S'. The 
cleavers move to the left with speed V, and the right one falls 2.5 ns before 
the left one. Even though the cleavers are only 80 cm apart, this lets them 
land 125 cm apart. 

The resolution of this paradox, and many similar paradoxes, is that two events 
that are simultaneous in one frame are not necessarily simultaneous in a different 
frame — an effect sometimes called the relativity of simultaneity. As soon as 
we recognize that the two cleavers fall at different times in the snake’s frame, 
there is no longer any problem understanding how they can both contrive to miss 
the snake. 
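
The numbers in this example are quick to verify. The sketch below transforms the two cleaver events from the student's frame to the snake's frame; it uses the rounded value c = 3.0 × 10⁸ m/s (an assumption made here so that the results match the round figures quoted in the example), and the function name is my own.

```python
import math

C = 3.0e8  # speed of light in m/s, rounded for comparison with the example


def lorentz_xt(x, t, V):
    """Return (x', t') in S' for an event at position x and time t in S."""
    gamma = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    return gamma * (x - V * t), gamma * (t - V * x / C**2)


V = 0.6 * C
SNAKE_HEAD_XP_M = 1.00  # snake's head in its own rest frame (proper length 100 cm)

xL_p, tL_p = lorentz_xt(0.00, 0.0, V)   # left cleaver:  x_L = 0,      t_L = 0
xR_p, tR_p = lorentz_xt(1.00, 0.0, V)   # right cleaver: x_R = 100 cm, t_R = 0

print(f"left cleaver  in S': x' = {xL_p * 100:.0f} cm, t' = {tL_p * 1e9:.1f} ns")
print(f"right cleaver in S': x' = {xR_p * 100:.0f} cm, t' = {tR_p * 1e9:.1f} ns")
print("right cleaver lands beyond the snake's head:", xR_p > SNAKE_HEAD_XP_M)
```

The right cleaver lands earlier and farther along than the snake expected, which is exactly the relativity of simultaneity at work.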


15.7 The Relativistic Velocity-Addition Formula 


As our next, and very important, application of the Lorentz transformation, let us use 
it to derive the relativistic velocity-addition formula. This formula is the answer to 
the following question: If an object — an electron, a baseball, a planet — is moving 
with velocity v relative to an inertial frame S, how can we calculate its velocity v' 
relative to some other frame S'? In classical physics, the answer to this question is the 
classical velocity-addition formula: If V denotes the velocity of S' relative to S, then 
$\mathbf{v}' = \mathbf{v} - \mathbf{V}$. (Presumably, whoever named this formula wrote it as $\mathbf{v} = \mathbf{v}' + \mathbf{V}$.) For 
the special case that the axes of S and S' are parallel and V is in the x direction (our 
“standard” configuration), this becomes 

$$v'_x = v_x - V, \quad v'_y = v_y, \quad\text{and}\quad v'_z = v_z. \tag{15.23}$$

Our task now is to find the corresponding relativistic result. 

Consider a particle moving with position $\mathbf{r}(t)$ or $\mathbf{r}'(t')$, as seen in S or S'. The 
definition of the velocity $\mathbf{v}$ is the derivative 

$$\mathbf{v} = \frac{d\mathbf{r}}{dt}, \tag{15.24}$$

where $d\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$ is the infinitesimal displacement between the positions at times $t_1$ 
and $t_2 = t_1 + dt$. Now, we can write down the Lorentz transformation for $(x_2, y_2, z_2, t_2)$ 
and $(x_1, y_1, z_1, t_1)$, and taking differences, we find 

$$dx' = \gamma(dx - V\,dt), \quad dy' = dy, \quad dz' = dz, \quad dt' = \gamma(dt - V\,dx/c^2). \tag{15.25}$$

(Notice that $d\mathbf{r}$ and dt satisfy exactly the same transformation equations as $\mathbf{r}$ and t. 
This is because the Lorentz transformation turned out to be linear.) Using the definition 
(15.24), we can write down the components of $\mathbf{v}'$, and substituting (15.25) we find 
for $v'_x$ 

$$v'_x = \frac{dx'}{dt'} = \frac{\gamma(dx - V\,dt)}{\gamma(dt - V\,dx/c^2)}$$

or, canceling the factors of $\gamma$ and dividing top and bottom by dt, 

$$v'_x = \frac{v_x - V}{1 - v_x V/c^2}. \tag{15.26}$$

Similarly, 

$$v'_y = \frac{dy'}{dt'} = \frac{dy}{\gamma(dt - V\,dx/c^2)}.$$

Dividing top and bottom by dt, we find for $v'_y$ (and similarly $v'_z$) 

$$v'_y = \frac{v_y}{\gamma(1 - v_x V/c^2)} \quad\text{and}\quad v'_z = \frac{v_z}{\gamma(1 - v_x V/c^2)}. \tag{15.27}$$

Notice that $v'_y \ne v_y$ even though dy' = dy. This is because $dt' \ne dt$. Notice also that 
$\gamma$ is the factor pertaining to the speed V of S' relative to S; that is $\gamma = 1/\sqrt{1 - V^2/c^2}$. 

The three equations in (15.26) and (15.27) are the relativistic velocity-addition 
formulas or the relativistic velocity transformation. If all velocities are much less than 
c, then $\gamma \approx 1$ and we can ignore the second term in the denominators, and we recover 
the classical results (15.23). However, when the velocities concerned approach c, the 
relativistic velocity transformation can have some surprising results, as the following 
examples illustrate. 


example 15.4 Adding Two Velocities Close to c 

A rocket traveling at speed 0.8c relative to the earth shoots forward bullets with 
speed 0.6c (relative to the rocket). What is the bullets’ speed relative to the earth? 

If we choose frames in the usual way, with S fixed to the earth and S' fixed 
to the rocket, then V = 0.8c and v' = 0.6c. Our task is to find v. The classical 
answer is of course v = v' + V = 1.4c. The relativistic answer is given by the 
inverse of (15.26), which we can find by the usual trick of exchanging primed 
and unprimed variables and reversing the sign of V. The result (from which I 
omit the subscripts x since all velocities are in the x direction) is

$v = \frac{v' + V}{1 + v'V/c^2} = \frac{0.6c + 0.8c}{1 + 0.8 \times 0.6} = \frac{1.4}{1.48}\,c \approx 0.95c.$ (15.28)


The striking thing about this is that when we “add” 0.8c to 0.6c we get an answer
that is less than c. In fact, it is fairly easy to prove that for any velocity with $v' < c$,
the corresponding $v$ is also, automatically, less than c. (See Problem 15.43.) That
is, anything that travels with speed less than c in one frame has speed less than
c in all frames.
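For motion along the x axis, the claim that $v' < c$ implies $v < c$ can be sketched in two lines. This is only an outline of one possible argument for the one-dimensional case, assuming $|v'| < c$ and $|V| < c$; the general case is the subject of Problem 15.43.

```latex
% Sketch for velocities along the x axis with |v'| < c and |V| < c.
% Starting from (15.28), v = (v' + V)/(1 + v'V/c^2):
\begin{align*}
1 - \frac{v}{c}
  &= \frac{(1 + v'V/c^2) - v'/c - V/c}{1 + v'V/c^2}
   = \frac{(1 - v'/c)(1 - V/c)}{1 + v'V/c^2} > 0, \\
1 + \frac{v}{c}
  &= \frac{(1 + v'/c)(1 + V/c)}{1 + v'V/c^2} > 0 .
\end{align*}
% Every factor on the right is positive, so -c < v < c.
```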


Example 15.5 Adding Two Velocities One of Which Equals c

The rocket of Example 15.4 shoots forward a signal (a pulse of light, for instance) 
with speed c relative to the rocket. What is the signal’s speed relative to the earth? 
Here $v' = c$, so (15.28) becomes

$v = \frac{v' + V}{1 + v'V/c^2} = \frac{c + V}{1 + V/c} = c.$ (15.29)


That is, a signal traveling in the x direction with speed c relative to S' also 
has speed c relative to S. This result is actually true whatever the direction of 
travel, as you can prove in Problem 15.43. It asserts that the speed of light is 
invariant under the Lorentz transformation, in obedience to the second postulate 
of relativity (which led us to the Lorentz transformation in the first place). 


15.8 Four-Dimensional Space-Time; Four-Vectors 


The Lorentz transformation (15.20) mixes up space and time, in the sense that each of 
the equations for x' and t' involves both x and t. The Russian-German mathematician 
Hermann Minkowski (1864-1909) suggested that this mixing of space and time 
implies that time should be combined with the three spatial coordinates to form a 
four-dimensional space-time, on which the Lorentz transformations act as a kind of 
rotation. Before we examine this suggestion, let us review a couple of facts about the 
ordinary rotations of our everyday three-dimensional space. 


Rotations of Ordinary Three-Dimensional Space 

In discussing the vectors of ordinary three-dimensional space, it will be convenient to
change our notation a bit: For any choice of orthogonal axes, I shall use the notation
mentioned in Chapter 1, with the three unit vectors labeled $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$. The components
of a general vector $\mathbf{q}$ we’ll call $q_1, q_2, q_3$, so that

$\mathbf{q} = q_1\mathbf{e}_1 + q_2\mathbf{e}_2 + q_3\mathbf{e}_3 = \sum_{i=1}^{3} q_i\mathbf{e}_i$ (15.30)

where, as usual,

$q_i = \mathbf{e}_i \cdot \mathbf{q}.$ (15.31)

To conform with this notation, I shall, from now on, rename the position vector
$\mathbf{r} = (x, y, z)$ as $\mathbf{x} = (x_1, x_2, x_3)$.

Now consider a rotation which carries the axes defined by $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ into a second
set with unit vectors $\mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3$. The components $q'_i$ of the same vector with respect
to the new axes are easily found:

$q'_i = \mathbf{e}'_i \cdot \mathbf{q} = \mathbf{e}'_i \cdot \sum_{j=1}^{3} q_j\mathbf{e}_j = \sum_{j=1}^{3} (\mathbf{e}'_i \cdot \mathbf{e}_j)\,q_j.$ (15.32)

This equation expresses each of the coordinates $q'_i$ in the new coordinate system as a
sum over the coordinates $q_j$ in the old system. (That is, it does for rotations what the
Lorentz transformation does when we pass between frames in relative motion.) The
coefficients in this sum are the scalar products $\mathbf{e}'_i \cdot \mathbf{e}_j$ of the unit vectors of the new
and old systems. 13

We can express (15.32) more compactly if we adopt the matrix notation of Chapter
10: Let $\mathbf{R}$ denote the 3 x 3 square matrix with elements

$R_{ij} = \mathbf{e}'_i \cdot \mathbf{e}_j$ (15.33)

and let $\mathbf{q}$ and $\mathbf{q}'$ denote the 3 x 1 columns made up of the coordinates

$\mathbf{q} = \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix} \quad\text{and}\quad \mathbf{q}' = \begin{bmatrix} q'_1 \\ q'_2 \\ q'_3 \end{bmatrix}.$ (15.34)

[As in Chapter 10, it is good to stay a little relaxed about this notation. When there
is any danger of confusion, we’ll agree that $\mathbf{q}$ is a column matrix as in (15.34), but,
when there is no such danger, we’ll continue to call $\mathbf{q}$ a “vector” and even write it
as the row $(q_1, q_2, q_3)$.] With these notations, the rotation (15.32) takes the compact
form

$\mathbf{q}' = \mathbf{R}\mathbf{q}.$ (15.35)

The effect of rotating our axes is to multiply the column $\mathbf{q}$ of coordinates $q_i$ by a
certain 3 x 3 rotation matrix $\mathbf{R}$.


Example 15.6 A Simple Rotation about One Axis

Consider a set of rectangular axes with the $x_1 x_2$ plane chosen horizontal and the
$x_3$ axis vertically upward, and suppose we rotate these axes about the $x_2$ axis
through an angle $\theta$, to give a new set of axes as shown in Figure 15.7. Find the
rotation matrix $\mathbf{R}$ for this rotation.

The required matrix $\mathbf{R}$ is easily written down using (15.33). Inspection of
Figure 15.7 lets one evaluate the necessary scalar products to give


$\mathbf{R} = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}.$ (15.36)

The effect of the rotation $\mathbf{R}$ on the coordinates of any point is $\mathbf{x}' = \mathbf{R}\mathbf{x}$ or

$x'_1 = (\cos\theta)\,x_1 + (\sin\theta)\,x_3$
$x'_2 = x_2$ (15.37)
$x'_3 = (-\sin\theta)\,x_1 + (\cos\theta)\,x_3.$

Figure 15.7 The primed axes are obtained from the unprimed by a
counter-clockwise rotation about the $x_2$ axis (into the page) through
an angle $\theta$. The $x_2$ direction is unaffected by this rotation and the unit
vectors $\mathbf{e}_2$ and $\mathbf{e}'_2$ both point into the page.

13 The numbers $\mathbf{e}'_i \cdot \mathbf{e}_j$ are often called the direction cosines of the new axes with respect to the
old, since $\mathbf{e}'_i \cdot \mathbf{e}_j = \cos\theta_{ij}$ where $\theta_{ij}$ is the angle between $\mathbf{e}'_i$ and $\mathbf{e}_j$.

One of the best arguments for regarding the three numbers $x_1, x_2, x_3$ as
coordinates in a single three-dimensional space is that rotations can mix them up
as in (15.37). One could imagine people taking the view that vertical distances
($x_3$) were somehow fundamentally different from horizontal ones ($x_1$ or $x_2$). 14
But surely such people would be dissuaded from this view when they noticed
that (15.37) mixes $x_1$ and $x_3$ together (and, for $\theta = \pi/2$, simply exchanges their
roles). We shall now argue similarly that Lorentz transformations are a kind of
rotation that mixes the space and time coordinates in a four-dimensional space-time.
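As a numerical illustration of Example 15.6, the sketch below builds the matrix (15.36) and applies it to a column of coordinates. It is a minimal sketch using plain Python lists; the helper names `rotation_about_x2` and `rotate` are hypothetical and chosen only for this illustration.

```python
import math

def rotation_about_x2(theta):
    """Rotation matrix (15.36) for a rotation of the axes about x2 by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c,   0.0, s],
            [0.0, 1.0, 0.0],
            [-s,  0.0, c]]

def rotate(matrix, column):
    """Matrix times column, x' = R x, as in (15.35) and (15.37)."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, column)) for row in matrix]

def length_squared(x):
    return sum(component**2 for component in x)

x = [1.0, 2.0, 3.0]

# For theta = pi/2 the roles of x1 and x3 are exchanged (up to a sign).
quarter_turn = rotate(rotation_about_x2(math.pi / 2), x)

# For any angle the length of the vector is unchanged.
tilted = rotate(rotation_about_x2(0.3), x)
```

With these inputs `quarter_turn` is (3, 2, -1) to within rounding, and `length_squared(tilted)` agrees with `length_squared(x)`.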


Lorentz Transformations as “Rotations” of Space-Time 

A glance at the Lorentz transformation (15.20) should convince you that it mixes $x$
and $t$ in somewhat the same way that the rotation (15.37) mixes $x_1$ and $x_3$. We can
make this parallel surprisingly close by polishing our notation. First, we’ll rename our
space coordinates $x_1, x_2, x_3$ as above, and we’ll introduce a fourth coordinate

$x_4 = ct$ (15.38)


14 Bizarre as such a view may appear, it does seem to be endorsed by some standard practices. 
For example, people in the business of storing water behind dams measure the volume of stored 
water in acre-feet, with horizontal areas measured in acres, but vertical depths in feet.

where the factor c guarantees that $x_4$ has the same dimensions as $x_1$, $x_2$, and $x_3$.
Recalling the definition $\beta = V/c$, we can rewrite the Lorentz transformation
(15.20) as

$x'_1 = \gamma x_1 - \gamma\beta x_4$
$x'_2 = x_2$
$x'_3 = x_3$ (15.39)
$x'_4 = -\gamma\beta x_1 + \gamma x_4.$

We can further improve the parallel with the rotation (15.37) if we note that since
$\gamma \geq 1$ we can define an “angle” $\phi$ such that $\gamma = \cosh\phi$. Some simple algebra
(Problem 15.30) should convince you that this makes $\gamma\beta = \sinh\phi$ and (15.39) becomes

$x'_1 = (\cosh\phi)\,x_1 - (\sinh\phi)\,x_4$
$x'_2 = x_2$
$x'_3 = x_3$ (15.40)
$x'_4 = (-\sinh\phi)\,x_1 + (\cosh\phi)\,x_4$

and our parallel is as close as it can be. It is important to understand that no one would
claim that the Lorentz transformation (15.40) mixes $x_1$ and $x_4$ in exactly the way that
the rotation (15.37) mixes $x_1$ and $x_3$: the trig functions of (15.37) have become hyperbolic
functions in (15.40) (and a sign has changed). Nevertheless, the parallel is
close and is a powerful argument for regarding $x_4 = ct$ as the fourth coordinate in a
four-dimensional space-time or just four-space.
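The hyperbolic form (15.40) can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and it uses the standard identity that $\gamma = \cosh\phi$ and $\gamma\beta = \sinh\phi$ together give $\phi = \tanh^{-1}\beta$. It also illustrates a well-known consequence not spelled out in the text above: for boosts along the same axis the “angles” simply add, which reproduces the velocity-addition formula (15.28).

```python
import math

def hyperbolic_angle(beta):
    """The "angle" phi with cosh(phi) = gamma and sinh(phi) = gamma * beta."""
    return math.atanh(beta)

def boost_x1_x4(x1, x4, phi):
    """The x1 and x4 lines of (15.40); x2 and x3 are unchanged."""
    return (math.cosh(phi) * x1 - math.sinh(phi) * x4,
            -math.sinh(phi) * x1 + math.cosh(phi) * x4)

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)      # 1.25 for beta = 0.6
phi = hyperbolic_angle(beta)

# (15.40) with phi agrees with (15.39) written with gamma and beta.
x1_new, x4_new = boost_x1_x4(1.0, 2.0, phi)

# Adding the angles of two collinear boosts gives the combined speed of (15.28).
combined_beta = math.tanh(hyperbolic_angle(0.6) + hyperbolic_angle(0.8))
```

Here `combined_beta` equals 1.4/1.48, the value found in Example 15.4.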


Four-Vectors 

The four numbers $x_1, x_2, x_3$ and $x_4 = ct$ constitute a vector in four-dimensional
space-time. Such vectors are called four-vectors to distinguish them from the three-dimensional
vectors, such as the position three-vector $\mathbf{x} = (x_1, x_2, x_3)$. Unfortunately,
several different notations are used for four-vectors. I shall use ordinary italic letters
for four-vectors; for example,

$x = (x_1, x_2, x_3, x_4) = (\mathbf{x}, ct)$

for the position-time vector just discussed. We shall be meeting several other four-vectors
(for example, the four-momentum $p$, to be defined shortly). My notation for
an arbitrary four-vector will be

$q = (q_1, q_2, q_3, q_4) = (\mathbf{q}, q_4),$

where the bold face $\mathbf{q}$, comprising the first three components of $q$, is called the spatial
component of $q$ and the fourth component $q_4$ is called the time component. 15


15 This notation has two drawbacks you should be alert to: (1) Since we’ll still be using italic 
symbols for various scalars (for example, m for mass), you will need to tell from the context whether 
an italic symbol is a four-vector or a scalar. (2) We can no longer use the convention that q is the

As with three-vectors, it is often convenient to understand four-vectors to be 4 x 1
columns,

$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \quad\text{and}\quad q = \begin{bmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \end{bmatrix}.$

With this notation, the Lorentz transformation (15.39) can be written in matrix form
as

$x' = \Lambda x$ (15.42)

where $\Lambda$ is the 4 x 4 matrix

$\Lambda = \begin{bmatrix} \gamma & 0 & 0 & -\gamma\beta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\gamma\beta & 0 & 0 & \gamma \end{bmatrix}$ [standard boost]. (15.43)


This is not the most general Lorentz transformation. It is the transformation between 
two frames in what we have called standard configuration, with corresponding axes 
parallel and with the velocity of S' relative to S along the x axis. For many purposes,
this standard transformation is the only one we need to consider, but we should take 
a moment to discuss more general transformations. 
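The matrix form (15.42) with the standard boost (15.43) translates directly into code. The following is a minimal sketch with plain nested lists, assuming units in which c = 1 (so that $x_4 = t$); the helper names are illustrative only.

```python
import math

def standard_boost(beta):
    """The 4 x 4 standard-boost matrix of (15.43) for relative speed beta = V/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma,         0.0, 0.0, -gamma * beta],
            [0.0,           1.0, 0.0, 0.0],
            [0.0,           0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]

def apply(matrix, column):
    """x' = Lambda x, as in (15.42)."""
    return [sum(m * x for m, x in zip(row, column)) for row in matrix]

def multiply(a, b):
    """Matrix product a b of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Transform the four-vector x = (1, 2, 3, 4) to a frame with beta = 0.6.
x_prime = apply(standard_boost(0.6), [1.0, 2.0, 3.0, 4.0])

# Boosting by +beta and then by -beta returns to the original frame.
round_trip = multiply(standard_boost(-0.6), standard_boost(0.6))
```

For these inputs `x_prime` is (-1.75, 2, 3, 4.25), and `round_trip` is the 4 x 4 unit matrix to within rounding.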

Any Lorentz transformation which leaves corresponding axes parallel is called a 
pure boost or just boost, since all it does is “boost” us from one frame to another 
traveling at constant velocity relative to the first, without any rotation. The general 
transformation involves some rotation as well. If the transformation is a pure rotation 
(no relative motion, just a change of orientation) then of course t' = t and only the 
three spatial coordinates get changed. Thus we can write a pure rotation in the form 
(15.42), where the 4 x 4 matrix $\Lambda$ has the block form

$\Lambda_R = \left[\begin{array}{ccc|c} & & & 0 \\ & \mathbf{R} & & 0 \\ & & & 0 \\ \hline 0 & 0 & 0 & 1 \end{array}\right]$ [pure rotation] (15.44)

where $\mathbf{R}$ is the 3 x 3 matrix of the given rotation. (If you’ve never seen this kind of
block matrix before, write out the equation $x' = \Lambda_R x$ in all its detail, and notice how
it rotates the three spatial coordinates but leaves the fourth component unchanged, so
$t' = t$.)

magnitude of the three-vector q. Instead, we’ll just use |q| (though I’ll continue to use r for the
magnitude of the position vector, r = |x|).

If we want to write down a pure boost $\Lambda_B$ in an arbitrary direction $\mathbf{u}$, we can construct
it from a couple of rotations plus a standard boost: First we rotate so that our new
$x_1$ axis points along the required direction $\mathbf{u}$; next we make the standard boost (15.43);
and then we rotate back to our original orientation. Finally, any Lorentz transformation
$\Lambda$ can be expressed as the product of a boost followed by a suitable rotation,
$\Lambda = \Lambda_R \Lambda_B$. 16 For some practice at handling different Lorentz transformations, see
Problems 15.32 to 15.34.
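The recipe "rotate, make the standard boost, rotate back" can be tried out for the simplest non-trivial case, a boost along the $x_2$ axis. The sketch below is one possible illustration, not the text's own construction: the helper names are hypothetical, the rotation chosen is a quarter turn that puts the new $x_1$ axis along the old $x_2$ axis, and it uses the fact that the inverse of a rotation matrix is its transpose.

```python
import math

def multiply(a, b):
    """Matrix product a b of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def standard_boost(beta):
    """Standard boost (15.43) along x1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma,         0.0, 0.0, -gamma * beta],
            [0.0,           1.0, 0.0, 0.0],
            [0.0,           0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]

def rotation_block(r):
    """Embed a 3 x 3 rotation matrix in the 4 x 4 block form (15.44)."""
    return [list(row) + [0.0] for row in r] + [[0.0, 0.0, 0.0, 1.0]]

def boost_along_x2(beta):
    # R_ij = e'_i . e_j with e'_1 = e_2, e'_2 = -e_1, e'_3 = e_3,
    # so the new x1 axis points along the old x2 axis.
    r = [[0.0, 1.0, 0.0],
         [-1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0]]
    lam_r = rotation_block(r)
    # Rotate, boost along the new x1 axis, then rotate back.
    return multiply(transpose(lam_r), multiply(standard_boost(beta), lam_r))

lam_b = boost_along_x2(0.6)
```

The resulting matrix mixes $x_2$ and $x_4$ in exactly the way that (15.43) mixes $x_1$ and $x_4$, and leaves $x_1$ and $x_3$ alone.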

A four-vector is defined as anything that transforms in the same way as the
space-time vector $x = (\mathbf{x}, ct)$ under all Lorentz transformations (15.42). The formal
definition is this:


Definition of a Four-Vector

In each inertial frame S, a four-vector is specified by a set of four numbers
$q = (q_1, q_2, q_3, q_4)$ such that the values in two frames S and S' are related by the
equation $q' = \Lambda q$, where $\Lambda$ is the Lorentz transformation connecting S and S'.


Obviously the space-time vector $x = (\mathbf{x}, ct)$ fits this definition, and we shall find
several more examples in the next few sections (including the four-momentum p 
mentioned above). 

The great merit of the notion of four-vectors is that it often allows one to check with 
almost no effort whether a proposed physical law is relativistically invariant. Suppose 
for example we believe that there should be a law of the form 


$q = p$ (15.45)

where we know that $q$ and $p$ are four-vectors. (The law of conservation of momentum
has this form, $p_{\text{fin}} = p_{\text{in}}$, as we shall see.) Suppose further that the law is true in
one frame S. Since the corresponding values in any other frame S' are $q' = \Lambda q$ and
$p' = \Lambda p$, we have only to multiply both sides of (15.45) by $\Lambda$ and we see that $q' = p'$.
That is, the truth of our proposed four-vector law (15.45) in one frame S assures its 
truth in any other frame S'. (Of course, this doesn’t guarantee that the law is true — 
only experiment can test that — but it does guarantee that the law would be consistent 
with the postulates of relativity.) 

Any single quantity that is invariant under rotations is called a rotational scalar or 
a three-scalar; for example, the mass m of an object is a three-scalar, and so is the time 
t. In the same way, any single quantity that is invariant under Lorentz transformations 
is called a Lorentz scalar or four-scalar. For example, we shall see that the mass m 
(if suitably defined) is a four-scalar; that is, the mass of any object has the same value 
in all inertial frames. On the other hand, the time t is not a four-scalar; rather, as we 
have already seen, it is the fourth component of a four-vector. 


16 Actually we still don’t have the most general transformation, since the spatial origins of our 
two frames still coincide at t = t' = 0. This can be taken care of by shifting one of the origins, but 
this additional possibility need not concern us here.


15.9 The Invariant Scalar Product


A set of transformations — such as the set of all rotations in three-space, or all Lorentz 
transformations in four-space — can often be characterized by the quantities that 
they leave invariant. Our main interest here is, of course, in the Lorentz transformations,
but, for a little guidance as to what we should expect, let us look first at
rotations.


The Invariant Scalar Product in Three-Space 

One of the most obvious properties of rotations in three-space is that they do not 
change the length of any vector. If we define 

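$s = \mathbf{x} \cdot \mathbf{x} = x_1^2 + x_2^2 + x_3^2$ (15.46)

as the length squared of any vector $\mathbf{x}$, then the value of $s = \mathbf{x} \cdot \mathbf{x}$ in one coordinate
system is the same as its value $s' = \mathbf{x}' \cdot \mathbf{x}'$ in any other system obtained from the first
by rotation. (In the terminology just introduced, $s$ is a rotational scalar.) Since this
is true for any $\mathbf{x}$, we can replace $\mathbf{x}$ by $\mathbf{x} = \mathbf{a} + \mathbf{b}$ (where $\mathbf{a}$ and $\mathbf{b}$ are any two other
vectors) and the invariance of $(\mathbf{a} + \mathbf{b})^2$ implies that


$\mathbf{a} \cdot \mathbf{a} + 2\mathbf{a} \cdot \mathbf{b} + \mathbf{b} \cdot \mathbf{b} = \mathbf{a}' \cdot \mathbf{a}' + 2\mathbf{a}' \cdot \mathbf{b}' + \mathbf{b}' \cdot \mathbf{b}'.$ (15.47)


Canceling the terms that we already know to be equal, we find that

$\mathbf{a} \cdot \mathbf{b} = \mathbf{a}' \cdot \mathbf{b}'.$ (15.48)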

In other words, invariance of the length of any three-vector under rotation implies 
that the scalar product of any two three-vectors is invariant. We shall employ a similar 
argument for the scalar product of four-space in a moment. 


The Invariant Scalar Product in Four-Space 

We can construct a scalar product in four-space with several of the properties of its 
three-dimensional analog. For any four-vector $x = (x_1, x_2, x_3, x_4) = (\mathbf{x}, ct)$, let us
define


$s = x_1^2 + x_2^2 + x_3^2 - x_4^2 = r^2 - c^2t^2.$ (15.49)

This $s$ is obviously a generalization of the three-dimensional length squared, but
note well the minus sign on the fourth term. (It is because of this minus sign that
Lorentz transformations of space-time are not exactly analogous to rotations of
ordinary space.) The quantity $s$ is invariant under Lorentz transformations, as we can
easily prove: Consider first the standard boost (15.39). Under this transformation, the
quantity $s$ becomes

$s' = x_1'^2 + x_2'^2 + x_3'^2 - x_4'^2$
$\quad = \gamma^2(x_1 - \beta x_4)^2 + x_2^2 + x_3^2 - \gamma^2(-\beta x_1 + x_4)^2$
$\quad = \gamma^2(1 - \beta^2)\,x_1^2 + x_2^2 + x_3^2 - \gamma^2(1 - \beta^2)\,x_4^2$
$\quad = s$

where the cross terms in $x_1 x_4$ cancel between the first and last squares, and the last
equality follows because $\gamma^2(1 - \beta^2) = 1$. Therefore, the quantity $s$ is
invariant under the standard boost. But we have seen that any Lorentz transformation
can be built up from a standard boost and rotations, and $s$ is certainly unchanged by a
rotation (since $r^2$ and $t$ are separately invariant under rotation). Therefore $s$ is invariant
under any Lorentz transformation.
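The invariance of $s$ under the standard boost is easy to confirm numerically. The sketch below is a small illustration with hypothetical helper names, in units where c = 1; it evaluates (15.49) before and after the boost (15.39) for several values of $\beta$.

```python
import math

def invariant_s(x):
    """s = x1^2 + x2^2 + x3^2 - x4^2, as in (15.49)."""
    x1, x2, x3, x4 = x
    return x1**2 + x2**2 + x3**2 - x4**2

def standard_boost(x, beta):
    """Apply the standard boost (15.39) to the four-vector x."""
    x1, x2, x3, x4 = x
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * x1 - gamma * beta * x4,
            x2,
            x3,
            -gamma * beta * x1 + gamma * x4)

x = (1.0, 2.0, 3.0, 4.0)
s_original = invariant_s(x)                      # 1 + 4 + 9 - 16 = -2
s_boosted = [invariant_s(standard_boost(x, beta))
             for beta in (-0.9, -0.3, 0.0, 0.6, 0.99)]
```

Every entry of `s_boosted` agrees with `s_original` up to rounding error, even though the individual components $x'_1$ and $x'_4$ change considerably.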

Before we discuss the significance of the new invariant quantity s, we can use 
it to define an invariant scalar product in four-space. For any two four-vectors 
$x = (x_1, x_2, x_3, x_4)$ and $y = (y_1, y_2, y_3, y_4)$, we define 17


$x \cdot y = x_1y_1 + x_2y_2 + x_3y_3 - x_4y_4.$ (15.50)


(Again, note well the minus sign on the fourth component — this “scalar product” 
is a little different from the usual scalar product in ordinary space.) Obviously the 
invariant $s$ of (15.49) is just $s = x \cdot x$. And, as with rotations, the argument leading
from (15.47) to (15.48) implies that because x • x is invariant for any one four-vector 
x, the scalar product x • y is invariant for any two four-vectors x and y. We shall see 
that the scalar product x • y plays as big a role in relativity as the ordinary product 
a • b does in classical physics. The scalar product of any four-vector x with itself is 
often written as x • x = x 2 , and can be called the “invariant length squared” of x, but 
you must not be misled by this terminology into thinking that x 2 is positive. On the 
contrary, x 2 can obviously be positive, negative, or zero. 

For future reference, we can rewrite the invariance of $x \cdot y$ as a property of the
Lorentz matrices $\Lambda$: We know that $x \cdot y = x' \cdot y'$ whatever the values of $x$ and $y$,
provided $x' = \Lambda x$ and $y' = \Lambda y$ where $\Lambda$ is any Lorentz transformation. Therefore,
we can say that, for any Lorentz transformation $\Lambda$,

$x \cdot y = (\Lambda x) \cdot (\Lambda y)$ (15.51)

for any two columns of four numbers $x$ and $y$.
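Property (15.51) can also be stated as a single matrix equation. The block below is a sketch under notation that is assumed here and not taken from the text: $G$ denotes the diagonal matrix with entries (1, 1, 1, -1), and a superscript $T$ denotes the transpose.

```latex
% Notation assumed here (not from the text): G = diag(1, 1, 1, -1),
% and a superscript T denotes the matrix transpose.
\begin{align*}
x \cdot y &= x^{T} G\, y, \qquad
G = \begin{bmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&-1 \end{bmatrix}, \\
(\Lambda x)\cdot(\Lambda y) &= x^{T}\,\Lambda^{T} G \Lambda\, y
\quad\Longrightarrow\quad \Lambda^{T} G \Lambda = G .
\end{align*}
% Check for the standard boost (15.43), whose matrix is symmetric:
% (1,1) entry:  \gamma^2 - \gamma^2\beta^2 = 1
% (1,4) entry: -\gamma^2\beta + \gamma^2\beta = 0
% (4,4) entry:  \gamma^2\beta^2 - \gamma^2 = -1
```

The implication holds because (15.51) is required for every pair of columns $x$ and $y$.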

To understand where the scalar product came from, consider an experiment in
which a flash bulb at the origin $\mathbf{x} = 0$ is fired at time $t = 0$ in a frame S. The light
from the flash will spread out at speed c, so that at any later time $t$ it occupies the
sphere $r^2 = c^2t^2$. Using our new notation, we can say that the spreading wavefront is
located by the condition $x \cdot x = r^2 - c^2t^2 = 0$. Now, the invariance of $x \cdot x$ implies
that $x \cdot x = 0$ if and only if $x' \cdot x' = 0$ in any other frame S'. Therefore a spherical
wave spreading with speed c as seen in S will be a spherical wave spreading with
speed c in S', and vice versa. We see that the invariance of the scalar product $x \cdot x$ is
a reflection of the second postulate, that the speed of light is the same in all inertial
frames.

17 Be warned! Physicists are fairly evenly divided between those who use the definition (15.50)
and those who put a minus sign in front of the whole expression. Both conventions have their
advantages and disadvantages.


15.10 The Light Cone 


The scalar product $x \cdot x$ lets us divide space-time into five physically distinct regions.
To help visualize this, it is convenient to ignore one of the spatial dimensions ($x_3$ say),
so we can plot the remaining two spatial dimensions horizontally and $x_4 = ct$ vertically
up, as in Figure 15.8. (Mathematically, this amounts to confining our attention to the
“plane” $x_3 = 0$.) Consider again the light from a flashbulb fired at the space-time origin
O (that is, fired at $\mathbf{x} = 0$ when $t = 0$). As time passes, the light moves outward in the
$x_1 x_2$ plane on an expanding circle with $r^2 = c^2t^2$, and this sweeps out the upper half-cone
shown in Figure 15.8. This cone, called the forward light cone, is therefore
the set of all points in space-time that would be visited by light released from the
origin O. Mathematically, it is the set of all space-time points $x = (\mathbf{x}, ct)$ satisfying
$x \cdot x = r^2 - c^2t^2 = 0$ and $t > 0$.


Figure 15.8 (vertical axis: $x_4 = ct$) The light cone is defined by the condition $x \cdot x = r^2 - c^2t^2 = 0$
and divides space-time into five distinct parts: the forward and backward light
cones, with $t > 0$ and $t < 0$ respectively; the interiors of the forward and
backward light cones, called the absolute future and the absolute past; and
the outside of the cone, labeled “elsewhere.”







The lower half-cone shown in Figure 15.8 is called the backward light cone and 
is the set of all space-time points $x = (\mathbf{x}, ct)$ with the property that light released from
$x$ could subsequently pass through the origin, O. The whole light cone (forward and
backward) is made up of the straight lines representing the path of any light ray that
passes through the origin. Since light travels at speed c, and $x_4$ was cunningly chosen
to be $x_4 = ct$, these lines have slope 1, so the surface of the light cone (if drawn to
scale) makes an angle of 45° with the time axis. Since the light cone is defined by the
condition that $x \cdot x = 0$ and since $x \cdot x$ is invariant (has the same value in all frames),
it follows that the light cone is itself an invariant concept. That is, observers in any 
two frames will always agree as to which points lie on the light cone. 


Interior of the Light Cone; Future and Past 


Consider next a space-time point P, with coordinates $x = (\mathbf{x}, ct)$, that lies inside the
forward light cone. This obviously has $t > 0$ and $r^2 < c^2t^2$ or

$x_4 > 0, \quad\text{and}\quad x_1^2 + x_2^2 + x_3^2 < x_4^2 \quad (\text{or } x \cdot x < 0).$ (15.52)

These two conditions have a remarkable consequence: Notice first that since $t > 0$,
we can assert that any event that occurs at P is later than any event at O, at least as
observed in the frame S in which our coordinates $x$ are measured. But what about
some other frame S'? To answer this, note that the second condition in (15.52) is
just that $x \cdot x < 0$, and we know that $x \cdot x$ is invariant under Lorentz transformations.
Therefore $x' \cdot x'$ is also negative, and the second condition is satisfied in S' as well.
To see that the first condition is also satisfied in S', suppose first that S' is related to S
by the standard Lorentz boost (15.39), under which

$x'_4 = \gamma(x_4 - \beta x_1).$ (15.53)

Now, we know that $|\beta| < 1$ and the second condition in (15.52) guarantees that
$|x_1| < x_4$, so that $|\beta x_1| < x_4$. Therefore $x'_4 > 0$, and the first condition is also satisfied in S'. Since any
Lorentz transformation can be made up of a standard boost and rotations (and rotations
don’t change $x_4$ at all), we conclude that both conditions (15.52) hold in all frames if
they hold in one frame. In other words, the statement that P lies inside the forward 
light cone is a Lorentz-invariant statement. In particular, if P lies inside the forward 
light cone, then all observers will agree that an event that occurs at P is later than one 
at O. For this reason, the inside of the forward light cone is often called the absolute 
future — “absolute” because all observers agree that P is in the future of O. In a 
similar way, if P lies inside the backward light cone, then P is earlier than O as 
measured by all inertial observers, and this region is called the absolute past. (See 
Problem 15.39.) 

So far we have considered the light cone with its vertex at the space-time origin 
O. This is defined by the light rays which happen to pass through O. If we considered 
instead light which passes through some other space-time point, Q, this would define 
a light cone with its vertex at Q — the light cone of Q. This cone would look just 
like Figure 15.8, except that the vertex would be at the arbitrary point Q rather
than the origin O, as shown in Figure 15.9. Any point P on this cone must satisfy
$(\mathbf{x}_P - \mathbf{x}_Q)^2 = c^2(t_P - t_Q)^2$, so that

$(x_P - x_Q)^2 = 0$ (P on light cone of Q). (15.54)

Figure 15.9 The light cone of an arbitrary space-time point Q with coordinates
$x_Q = (\mathbf{x}_Q, ct_Q)$ is made up of all light rays that pass through Q. The
point shown as P lies outside the light cone of Q.

The inside of the forward light cone of Q is the absolute future of Q, all of whose
points are later than Q as seen by all inertial observers. That is, for any point P inside
the forward light cone of Q, all observers agree that $t_P > t_Q$.

Exterior of the Light Cone; Space-Like Vectors 

The situation is entirely different for a point P that lies outside the light cone, as in 
Figure 15.9. First, the condition for P to be outside is that 

$(\mathbf{x}_P - \mathbf{x}_Q)^2 > c^2(t_P - t_Q)^2,$ (15.55)

or, equivalently

$(x_P - x_Q)^2 > 0$ (P outside light cone of Q). (15.56)

This condition is symmetric between P and Q. Thus if P is outside the light cone
of Q, then Q is outside the light cone of P, and vice versa. It is clear from Figure
15.9 that there are points P outside the light cone of Q, with P later than Q (that is,
$t_P > t_Q$), and others for which P and Q are simultaneous ($t_P = t_Q$), and still others
with P earlier than Q ($t_P < t_Q$). There is nothing remarkable about this claim; it is a
straightforward consequence of the geometry of Figure 15.9. What is remarkable is
the following proposition (which I shall prove directly):

Proposition 

Let P be any given space-time point outside the light cone of a second given 
point Q. Then 

(1) there exist frames S in which $t_P > t_Q$
but

(2) there also exist frames S' in which $t'_P = t'_Q$
and

(3) there also exist frames S'' in which $t''_P < t''_Q$.

This startling proposition implies that the time ordering of any two given events,
each outside the other’s light cone, can be different in different frames: Where one
observer says that event A occurred before event B, a second observer can find
them the other way around (and a third can find them to be simultaneous). This has
profound implications related to the notion of causality: If one event A (an explosion,
for instance) is the cause of another event B (the collapse of a distant building),
then A must obviously occur first in time, since causes always precede their effects.
According to our proposition, if the space-time point P is outside the light cone of
Q, then neither Q nor P is unambiguously first in time. (In some frames it’s one, and
in some frames it’s the other.) Therefore, nothing that happens at Q can be the cause
of anything that happens at P, nor the other way around. Now, any kind of signal
traveling from Q to P would have to travel with speed greater than c. [This follows
from (15.55).] Conversely, if a signal emanating from Q had speed greater than c,
then it could travel to some point P outside Q’s light cone. It follows that no causal
influence can travel faster than the speed of light. 18 Because the region outside the
light cone of Q is completely immune to anything that happens at Q, this region is
sometimes called the “elsewhere” of Q.

To simplify the proof of our proposition, let us put the point Q at the origin O, 
and abbreviate the coordinates of P to $x = (\mathbf{x}, ct)$. (The general case is no harder; it
is just a bit messier notationally.) By making a rotation if necessary, we can put $\mathbf{x}$ on
the positive $x_1$ axis, so that


$x = (x_1, 0, 0, x_4).$ (15.57)

Now let us assume that statement (1) above is true (so that $x_4 > 0$), and prove
statements (2) and (3). [Obviously one of the statements must be true, and you can
easily check that our arguments work equally well starting from either (2) or (3).] Let
us now make the standard boost (15.39) to a new frame S' in which

$x'_4 = \gamma(x_4 - \beta x_1).$ (15.58)

Since P is outside the light cone of O, $\mathbf{x}^2 > c^2t^2$ which for the vector (15.57) means
that $x_1 > x_4$. Therefore we can choose $\beta = x_4/x_1 < 1$, and, according to (15.58),
$x'_4 = 0$. That is, $t' = 0$, and we have proved statement (2) above for the case that
Q = O. If P was outside the light cone of an arbitrary point Q, the corresponding
boost would give a vector $x'_P - x'_Q$ with its fourth component equal to zero, and hence
$t'_P = t'_Q$, again as required.

A four-vector whose fourth component is zero can be described as a pure-space 
vector, and one which can be brought into this form by a Lorentz transformation is 
called space-like. That is, a four-vector is space-like if there is a frame in which it is 
a pure-space vector, with zero fourth component. With this terminology, we can say
that the outside of the light cone is made up of all space-like vectors. Similarly, we
can rephrase the result about causal relations to say that, if the separation $x_P - x_Q$ of
two points P and Q is space-like, then nothing that happens at P can influence what
happens at Q, nor vice versa.

18 If the causal signal could have speed greater than c, then it could travel from Q to some P
outside Q’s light cone, but we have just seen that this is impossible.

To prove statement (3) [from the assumed truth of statement (1)] we have only to
look at the transformation (15.58) again. Since $x_1 > x_4$ we can choose $\beta$ a little bigger
than $x_4/x_1$ but still less than 1, and, with this choice, we get a frame (S'', say) in which
$t'' < 0$ (or, in general, $t''_P < t''_Q$) as required.
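The three cases of the proposition can be seen in a concrete example. The sketch below is illustrative only: the event coordinates and the function name are invented for this purpose, with Q at the origin and units in which c = 1. It takes a space-like point with $x_1 = 5$ and $x_4 = 3$ and evaluates (15.58) for three choices of $\beta$.

```python
import math

def boosted_x4(x1, x4, beta):
    """Fourth component after the standard boost, x4' = gamma (x4 - beta x1), as in (15.58)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (x4 - beta * x1)

# Hypothetical event P = (x1, 0, 0, x4), outside the light cone of O since x1 > x4 > 0.
x1, x4 = 5.0, 3.0

later = boosted_x4(x1, x4, 0.0)             # statement (1): original frame, x4 > 0
simultaneous = boosted_x4(x1, x4, x4 / x1)  # statement (2): beta = x4/x1 = 0.6
earlier = boosted_x4(x1, x4, 0.8)           # statement (3): beta a little bigger, still < 1
```

The same event is later than O in the first frame, simultaneous with O in the second, and earlier than O in the third.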


Time-Like Vectors 

An argument similar to that just given for space-like vectors (Problem 15.44) shows 
that if a four-vector $q$ lies inside the light cone (that is, $q \cdot q < 0$), then there exists a
frame S' in which it has the pure-time form $q' = (0, 0, 0, q'_4)$. Naturally, therefore, we
describe vectors inside the light cone as being time-like. These can then be subdivided
into forward time-like vectors (with $q_4 > 0$) and backward time-like (with $q_4 < 0$).
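The classification of four-vectors by the sign of $q \cdot q$ can be summarized in a few lines of code. This is a minimal sketch; the function names and the returned labels are choices made for this illustration, and the optional tolerance `tol` is an added convenience for floating-point input.

```python
def scalar_square(q):
    """q . q = q1^2 + q2^2 + q3^2 - q4^2, using the sign convention of (15.50)."""
    q1, q2, q3, q4 = q
    return q1**2 + q2**2 + q3**2 - q4**2

def classify(q, tol=0.0):
    """Label a four-vector by its position relative to the light cone."""
    s = scalar_square(q)
    if s > tol:
        return "space-like"
    if s < -tol:
        # Inside the cone q4 cannot vanish, so its sign is well defined.
        return "forward time-like" if q[3] > 0 else "backward time-like"
    return "on the light cone"

labels = [classify(q) for q in [(5, 0, 0, 3),     # outside the cone
                                (1, 0, 0, 2),     # inside, q4 > 0
                                (1, 0, 0, -2),    # inside, q4 < 0
                                (3, 4, 0, 5)]]    # 9 + 16 - 25 = 0
```

Because $q \cdot q$ is invariant, and the sign of $q_4$ is invariant for vectors inside the cone, every inertial observer would assign the same label.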

An important example of a forward time-like vector is the displacement four-vector 
dx of any material particle in a time dt. As we shall discuss shortly, a material particle 
can be defined as any particle with positive mass (m >0). Equivalently (as we shall 
see), it is any particle for which, at any given time, there exists a rest frame; that 
is, a frame in which the particle is at rest, with v = 0. It is a matter of experience 
that all of the normal constituents of matter — electrons, protons, neutrons — have 
this property, and likewise all composites, such as atoms, molecules, baseballs, and 
stars. 19 Suppose now that, between the times $t$ and $t + dt$, a material particle moves
from $\mathbf{x}$ to $\mathbf{x} + d\mathbf{x}$, and consider the four-vector displacement

$dx = (d\mathbf{x}, c\,dt) = (\mathbf{v}, c)\,dt.$

In the particle’s rest frame, $d\mathbf{x} = 0$, and $dx$ has the pure-time form $dx = (0, 0, 0, c\,dt)$.
Therefore, $dx$ is time-like in all frames, and $dx^2 < 0$. Since

$dx^2 = (v^2 - c^2)\,dt^2 < 0,$

we conclude that v 2 < c 2 in all frames; that is, material particles cannot travel with 
speeds greater than or equal to the speed of light. Notice that this is the third sense 
in which we have proved the speed of light acts as a universal speed limit: (1) The 
relative speed of any two inertial frames is always less than c. (2) In any one inertial 
frame the speed of any causal signal is always less than or equal to c. And now (3) in 
any inertial frame, the speed of any material particle is less than c. 


19 As we shall discuss in Section 15.16, the only common particle which does not have this 
property is the photon, the particle of light. Since this travels at speed c in all frames, it certainly 
does not have a rest frame. Naturally, we do not regard a photon as a material particle.


15.11 The Quotient Rule and Doppler Effect


As a beautiful application of the properties of four-vectors, we next discuss the 
Doppler effect — the change in frequency of a wave due to the motion of the wave’s 
source or the observer. Before we do this, we need to derive one more important 
property of four-vectors, the quotient rule. 


The Quotient Rule 

Suppose that we find a quantity $k$ that is specified by four numbers $k = (k_1, k_2, k_3, k_4)$
in every inertial frame S. It is naturally tempting to think that k is a four-vector, but 
this is not necessarily so. For example, in discussing the motion of an object of mass 
m, charge q, volume V, and temperature T, we could define 

k = (m, q, V, T ), 

and this set of four numbers would be defined in every frame but is fairly obviously 
not a four-vector; that is, its value in one frame S is not related to the value in another 
frame S' by the Lorentz transformation $k' = \Lambda k$. Although this example may seem
a bit artificial, it does show clearly that not every quantity $k$ with four components
is necessarily a four-vector. On the other hand, $k$ is a four-vector if it satisfies the
conditions of the following theorem: 


The Quotient Rule 

Suppose that $x$ is known to be a four-vector and that, in every inertial frame,
$k = (k_1, k_2, k_3, k_4)$ is a set of four numbers, and suppose further that for every
value of $x$ the quantity $\phi = k \cdot x = k_1x_1 + k_2x_2 + k_3x_3 - k_4x_4$ is found to have
the same value in all frames (that is, $\phi = k \cdot x$ is a four-scalar); then $k$ is a
four-vector.


The proof of this rule is surprisingly simple: First, that $\phi$ is a four-scalar implies that

$$k \cdot x = k' \cdot x'.$$ (15.59)

But from (15.51) we know that for any Lorentz transformation $\Lambda$

$$k \cdot x = (\Lambda k) \cdot (\Lambda x).$$

Now, by assumption, $x$ is a four-vector, so we can replace $\Lambda x$ by $x'$, to give

$$k \cdot x = (\Lambda k) \cdot x'.$$ (15.60)

Comparing (15.59) and (15.60), we see that

$$k' \cdot x' = (\Lambda k) \cdot x'.$$
This equation is true for any choice of $x'$. If we choose $x' = (1, 0, 0, 0)$, then it tells
us that the first component of $k'$ is equal to the first component of $\Lambda k$, and continuing
in this way we can show that all four components are equal, so that

$$k' = \Lambda k,$$

which shows that $k$ (the “quotient” of the scalar $\phi$ and the vector $x$) is indeed a
four-vector. Armed with this quotient rule, let us return to the Doppler effect.
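The key step of the proof, the invariance of the scalar product under a Lorentz transformation, can be checked numerically. The following Python sketch is only an illustration: the helper names (`boost`, `apply`, `dot4`) and the sample components are assumptions, not taken from the text, and four-vectors are stored as lists $(x_1, x_2, x_3, x_4)$ with $x_4 = ct$.

```python
import math

def boost(beta):
    """Standard Lorentz boost along x1, as a 4x4 matrix acting on (x1, x2, x3, x4)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, 0.0, 0.0, -gamma * beta],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-gamma * beta, 0.0, 0.0, gamma],
    ]

def apply(matrix, vec):
    """Matrix times four-component column."""
    return [sum(row[j] * vec[j] for j in range(4)) for row in matrix]

def dot4(a, b):
    """Invariant scalar product a.b = a1 b1 + a2 b2 + a3 b3 - a4 b4."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] - a[3] * b[3]

L = boost(0.6)                 # illustrative boost with beta = 0.6
k = [1.0, -2.0, 0.5, 3.0]      # arbitrary sample components
x = [0.3, 4.0, -1.0, 2.0]

lhs = dot4(k, x)                          # k.x in frame S
rhs = dot4(apply(L, k), apply(L, x))      # (Lambda k).(Lambda x) in frame S'
untransformed = dot4(k, apply(L, x))      # what happens if k is NOT transformed

print(lhs, rhs, untransformed)
```

The first two numbers agree, while the third does not: a set of four numbers that keeps the same values in every frame, like $(m, q, V, T)$, cannot give a frame-independent $k \cdot x$ for every $x$.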


Doppler Effect 

When you think of the Doppler effect, you probably think of the Doppler shift of 
sound. The sound of a police siren rushing toward us has a higher pitch, and then — 
as it passes us and speeds away — a lower pitch, than when the siren is stationary; 
as a train speeds past a crossing bell, the passengers hear first a higher pitch as the 
train approaches the bell, then a lower pitch as the train moves away. There is a 
corresponding effect with light and all other forms of electromagnetic waves. The 
famous “red shift” of light from stars is used routinely by astronomers to find how 
fast a star is moving away from us (and, indirectly, how far away the star is); in the 
“Doppler cooling” of atoms, the Doppler shift of laser light “seen” by a moving atom 
is used to selectively slow down fast moving atoms and hence bring groups of atoms 
to very low temperatures; and on the highway, the Doppler shift of radar bounced off 
the front of your car is used by the police to measure your speed. To derive the formula 
for the Doppler shift of light we must work relativistically. (Light travels at the speed 
of light!) Armed with our knowledge of four-vectors, we shall find that the derivation 
is surprisingly easy. 

Any sinusoidal plane wave has the form 

$$\phi = A\cos(\mathbf{k} \cdot \mathbf{x} - \omega t - \delta).$$ (15.61)

Here the nature of the function $\phi$ depends on the wave under consideration; for a sound
wave, it could be taken to be the pressure change produced by the sound; for light, it
could be any component of the electromagnetic field. The vector $\mathbf{k}$ is called the wave
vector; its direction is the direction of propagation of the wave and its magnitude is
$|\mathbf{k}| = 2\pi/\lambda$, where $\lambda$ is the wavelength; $\omega$ is the angular frequency, $\omega = 2\pi\nu$, where
$\nu$ is the ordinary frequency; and $\delta$ is a (usually not very interesting) constant phase
shift. The speed of the wave is $v = \omega/|\mathbf{k}|$; for light in a vacuum, this is of course $c$,
so $\omega = c|\mathbf{k}|$.

Our main concern with the plane wave (15.61) is the phase $\mathbf{k} \cdot \mathbf{x} - \omega t$. It is
impossible to resist writing this as a four-dimensional scalar product

$$\mathbf{k} \cdot \mathbf{x} - \omega t = k \cdot x,$$ (15.62)

where as always $x = (\mathbf{x}, ct)$ and $k$ denotes the wave four-vector,

$$k = (\mathbf{k}, \omega/c).$$ (15.63)
Figure 15.10 The Doppler experiment. A source of light is moving along the $x_1$
axis with speed $V$ relative to the frame S. The observer in frame S sees the light
from the source traveling at an angle $\theta$ with the $x_1$ axis. The light’s frequency is
$\omega$ as measured in S and $\omega' = \omega_0$ as measured in the source’s rest frame S'.


To prove that $k$ defined this way really is a four-vector, we note that the phase $k \cdot x$
at any point $x$ determines the position on the wave relative to the troughs or crests of
the wave. Since this has to be the same in any frame, it follows that $k \cdot x$ is a
four-scalar, and since $x$ is certainly a four-vector, the quotient rule guarantees that $k$ is a
four-vector as its name implies. Since the fourth component of $k$ is $\omega/c$ and since we
now know how $k$ transforms, we are ready to find the frequency of a light signal as
measured in a frame relative to which the source is moving.

The experiment we have in mind is shown in Figure 15.10. An observer at rest in
the frame S observes a railroad car moving at speed $V$ along the $x_1$ axis. The car is
emitting light of angular frequency $\omega' = \omega_0$, as measured in the car’s rest frame S'. If
the light reaching the observer travels at an angle $\theta$ with the $x_1$ axis, we want to know
its frequency $\omega$, as measured by the observer.

The wave four-vector $k$ of the light reaching the observer has the form $k = (\mathbf{k}, k_4)$
where $k_4 = \omega/c = |\mathbf{k}|$. According to the standard Lorentz boost,

$$k_4' = \gamma(k_4 - \beta k_1).$$

Setting $k_4' = \omega'/c$, $k_4 = \omega/c$, and $k_1 = |\mathbf{k}|\cos\theta = (\omega/c)\cos\theta$, we find that

$$\omega' = \gamma\omega(1 - \beta\cos\theta).$$

Solving for $\omega$ and replacing $\omega'$ by $\omega_0$, we get the relativistic Doppler formula for
light

$$\omega = \frac{\omega_0}{\gamma(1 - \beta\cos\theta)}$$ (15.64)

where $\omega_0$ is the frequency of the light in the rest frame of the source, $\omega$ is the frequency
observed in a frame where the source has velocity $\mathbf{V}$, and $\theta$ is the angle between $\mathbf{V}$
and the observed light.
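To get a feel for the formula (15.64), here is a minimal Python sketch that evaluates it for three special directions. The function name `doppler_omega` and the choice $\beta = 0.5$ are illustrative assumptions; frequencies are in units of $\omega_0$.

```python
import math

def doppler_omega(omega0, beta, theta):
    """Observed angular frequency from (15.64): omega0 / (gamma * (1 - beta cos(theta)))."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return omega0 / (gamma * (1.0 - beta * math.cos(theta)))

beta = 0.5
approaching = doppler_omega(1.0, beta, 0.0)          # light travels along V: source approaches
receding = doppler_omega(1.0, beta, math.pi)         # light travels opposite to V: source recedes
transverse = doppler_omega(1.0, beta, math.pi / 2)   # light observed at right angles to V

print(approaching, receding, transverse)
```

For $\theta = 0$ the formula reduces to $\omega = \omega_0\sqrt{(1+\beta)/(1-\beta)}$ (a shift to higher frequency), for $\theta = \pi$ to $\omega_0\sqrt{(1-\beta)/(1+\beta)}$ (a red shift), and for $\theta = \pi/2$ to $\omega_0/\gamma$, a purely relativistic shift caused by time dilation.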
15.12 Mass, Four-Velocity, and Four-Momentum


You may be becoming impatient that in eleven sections we have so far discussed only 
the kinematics of relativity. In fact, this reflects a truth about relativity — that many 
of its most interesting features, such as time dilation and the impossibility of causal 
signals traveling faster than the speed of light, are purely kinematic. Nevertheless, it 
is high time we took up relativistic dynamics, and this is what we now do. 

In this section, I shall introduce the relativistic definitions of the mass and momentum
of an object. It is important to recognize that there is really no such thing
as the “correct” definition of a concept like mass or momentum when we move into
the terra incognita of a new subject like relativity. Like Humpty Dumpty, we are, in
principle, entitled to define words however we want. 20 Nevertheless, there are certain
requirements of reasonableness that we can hope to impose: Any definition of momentum
should coincide as closely as possible with the nonrelativistic definition in
the domain where the latter has proved useful, namely when the speed of the object
is much less than $c$. And we would like our new definitions to share with their
nonrelativistic counterparts any properties that seem essential to the concept concerned.
For example, we shall seek a definition of relativistic momentum with the property
that the total momentum of an isolated system is conserved.


Mass in Relativity 

There are, in fact, two different definitions of mass in relativity, both of which meet our 
requirements of reasonableness, and both with their supporters. The definition I shall 
use can be described as the invariant mass and is favored by the majority of practising 
physicists. The other, called the variable mass, is favored mostly by popularizers of 
relativity, since it makes some ideas seem easier at first. I shall describe the variable 
mass briefly later, but throughout this chapter I shall use the invariant mass, which is 
defined as follows: Given any object at rest (or moving with speed much, much less 
than c), we know that the nonrelativistic definition of mass produces a well-defined 
and useful quantity. To emphasize its definition, this mass is often called the rest 
mass. To define the mass of the same object traveling at high speed, we shall adopt 
the following, embarrassingly simple definition: 


Definition of Invariant Mass 

The mass, $m$, of an object, whatever its speed, is defined to be its rest mass.


If observers in an inertial frame S see an object sailing by at half the speed of light
and want to know its mass, they must somehow bring the object to rest (or move 
themselves into a frame moving with the object) and then measure its mass using 


20 “When I use a word,” Humpty Dumpty said in a rather scornful tone, “it means just what I
choose it to mean, neither more nor less.” Lewis Carroll, Through the Looking-Glass.
any convenient technique of nonrelativistic mechanics. The equivalence of all inertial
frames guarantees that this procedure will produce the same answer in all frames, 
so the resulting mass can be called the invariant mass. However, since it is the only 
definition we shall be using, we shall generally call it just the mass. Since the mass 
defined this way has the same value in all frames, it is a Lorentz scalar. 


The Proper Time of a Body 

Before we take up the definition of relativistic momentum, it is convenient to introduce 
two more important kinematic quantities. The three-dimensional position $\mathbf{x}(t)$ of a
body at time $t$ defines a point $x = (\mathbf{x}(t), ct)$ in space-time, and, as time advances, this
point traces a path, called the body’s world line. We have seen that the separation $dx$
between neighboring points $x$ and $x + dx$ on the world line of a material body is a
time-like vector. This means that there is a frame (namely the body’s rest frame) where
the separation is pure time-like, with the form $dx_0 = (0, 0, 0, c\,dt_0)$. (The subscript
“0” indicates the rest frame.) Since the two positions in three-space are equal ($\mathbf{x}_0 =
\mathbf{x}_0 + d\mathbf{x}_0$), the time $dt_0$ is the proper time between the two points on the body’s world
line. To find this proper time, we don’t actually have to go to the rest frame. In any
other frame, the separation has the form

$$dx = (\mathbf{v}\,dt, c\,dt)$$

and since $dx_0^2 = dx^2$, it follows that $-c^2\,dt_0^2 = (v^2 - c^2)\,dt^2$, which we can solve for
$dt_0$ to give

$$dt_0 = dt\sqrt{1 - v^2/c^2} = \frac{dt}{\gamma(v)}$$ (15.65)

where $\gamma(v)$ is the familiar $\gamma$ factor $1/\sqrt{1 - v^2/c^2}$, calculated for the body’s speed $v$.
[You will recognize (15.65) as the time-dilation formula (15.11), so we didn’t really
need to go through this calculation.] We can apply (15.65) in any frame S and will,
of course, get the same value for $dt_0$. That is, the proper time $dt_0$ is a Lorentz scalar,
which makes it a convenient quantity to work with, as we shall see.
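As a small worked instance of (15.65), the following Python sketch computes the proper time elapsed on a body moving at constant speed. The function names and the sample values (10 s of frame time at $v = 0.8c$) are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    """The factor 1 / sqrt(1 - v^2/c^2) for speed v in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def proper_time(dt, v):
    """Proper time dt_0 = dt / gamma(v), per (15.65), for constant speed v."""
    return dt / gamma(v)

tau = proper_time(10.0, 0.8 * C)  # frame time 10 s, body moving at 0.8c
print(tau)
```

With $v = 0.8c$ the factor $\sqrt{1 - v^2/c^2}$ is 0.6, so the body's own clock advances about 6 s while 10 s pass in the frame S.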


The Four-Velocity 

We have seen that the three-dimensional velocity v of a body transforms according 
to the rather complicated velocity-addition formulas (15.26) and (15.27). The reason 
for the complication is easy to see: The three-velocity v = dx/dt is the quotient of a 
three-vector dx and the fourth component dt of a four-vector. So, little wonder that 
it transforms awkwardly. Having recognized the problem, we can easily construct a 
related vector which transforms more simply. If we considered $\mathbf{u} = d\mathbf{x}/dt_0$ instead of
$d\mathbf{x}/dt$, then at least the denominator would be a scalar. In fact, while we’re about it,
we may as well consider the four-vector

$$u = \frac{dx}{dt_0} = \left(\frac{d\mathbf{x}}{dt_0},\; c\,\frac{dt}{dt_0}\right).$$ (15.66)
Since this four-velocity is the quotient of a four-vector and a four-scalar, it clearly is
a four-vector. If we use (15.65) to replace $dt_0$ by $dt/\gamma$, we find that

$$u = \gamma\left(\frac{d\mathbf{x}}{dt},\; c\,\frac{dt}{dt}\right) = \gamma(\mathbf{v}, c).$$ (15.67)

The most prominent feature of this result is that the three-velocity $\mathbf{v}$ is not the spatial
part of the four-velocity $u$ (which is why I called the latter $u$ rather than $v$). However,
if our body is moving much slower than $c$, then $\gamma \approx 1$ and the spatial part of the
four-velocity is indistinguishable from the ordinary three-velocity $\mathbf{v}$. As we shall see
directly, the fact that $u$ is a four-vector makes it useful in our efforts to construct a
relativistic mechanics.
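The result $u = \gamma(\mathbf{v}, c)$ is easy to explore numerically. The Python sketch below (function names and the sample velocity are illustrative assumptions) builds the four-velocity and evaluates its invariant square, which works out to $u \cdot u = \gamma^2(v^2 - c^2) = -c^2$ whatever the speed.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_velocity(v):
    """Four-velocity u = gamma * (v, c) for a three-velocity v = (v1, v2, v3) in m/s."""
    speed_squared = sum(vi**2 for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_squared / C**2)
    return [gamma * vi for vi in v] + [gamma * C]

def dot4(a, b):
    """Invariant scalar product a1 b1 + a2 b2 + a3 b3 - a4 b4."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] - a[3] * b[3]

u = four_velocity((0.6 * C, 0.0, 0.0))
u_squared = dot4(u, u)          # should equal -c^2

print(u[0] / C, u_squared / C**2)
```

At $v = 0.6c$ the spatial part of $u$ is $0.75c$, noticeably larger than $v$ itself, which illustrates the remark that $\mathbf{v}$ is not the spatial part of $u$.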


Relativistic Momentum 

We are now ready to address the next definition in our relativistic mechanics — the 
definition of the momentum $\mathbf{p}$ of a body with mass $m$ and velocity $\mathbf{v}$. We obviously
want our definition to agree with the classical definition ($\mathbf{p} = m\mathbf{v}$), at least at
nonrelativistic speeds, $|\mathbf{v}| \ll c$. What else we ask of our definition depends on what classical
property of momentum we regard as so important that it should carry over to relativity.
It would be hard to name the single most important property of momentum in classical
mechanics, but the conservation of momentum is surely a strong candidate, and we
shall look for a definition of $\mathbf{p}$ with the property that the total momentum $\mathbf{P} = \sum \mathbf{p}$
of an isolated system of bodies is conserved. To be consistent with the postulates of
relativity, this law, if true at all, must be true in all inertial frames.

The simplest possibility would be that we could continue to use the classical definition
$\mathbf{p} = m\mathbf{v}$, but we can dismiss this possibility fairly easily. With a little ingenuity
one can construct a thought experiment in which the total classical momentum $\sum m\mathbf{v}$
is conserved in one frame S, but not in a second frame S'. One example, shown in
Figure 15.11, is an elastic collision of two equal-mass particles $a$ and $b$. As seen in
frame S, the two particles approach the origin with equal and opposite velocities in
the $x_1x_2$ plane, and emerge with the $x_2$ components of their velocities reversed.
Obviously, the total classical momentum, as measured in S, is zero ($\sum m\mathbf{v} = 0$) before
and after the collision, and classical momentum is conserved. Figure 15.11(b) shows
the same experiment, as seen in a frame S' which travels along the $x_1$ axis of S with
speed $V$ equal to the $x_1$ component of the velocity of particle $a$, so that particle $a$ travels
straight up and then down the $x_2'$ axis as seen in S'. Using the relativistic velocity transformations
(15.26) and (15.27), one can find the four velocities as measured in S'. Since these
calculations, although reasonably straightforward, are quite messy, I shall leave them
as an exercise for the reader (Problem 15.54), but the important conclusion is easily
stated: When we substitute the velocities of S' we find that the total classical momentum
is not conserved in frame 21 S'; that is, $\sum m\mathbf{v}'_{\text{in}} \neq \sum m\mathbf{v}'_{\text{fin}}$. Evidently, if we were


21 The reason is actually fairly easy to see. The transformation (15.27) of the y component of 
velocity depends on the x component. Since particles a and b have different x components of velocity 
Figure 15.11 An elastic collision between two equal-mass particles, $a$ and $b$. (a) In
frame S, the incoming particles approach with equal but opposite velocities and emerge
with their $x_2$ components reversed. The total classical momentum $\sum m\mathbf{v}$ is zero before
and after the collision. (b) The frame S' has velocity equal to the $x_1$ component of $a$’s
initial velocity in S. Using the relativistic velocity-addition formula, it is easy to show
that the total classical momentum $\sum m\mathbf{v}'$ is not conserved in this frame.


to adopt the classical definition of momentum, a law of conservation of momentum 
would be inconsistent with the postulates of relativity. 
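The collision of Figure 15.11 can be checked numerically. The Python sketch below is an illustration under stated assumptions: units with $c = 1$, unit masses, sample velocity components 0.6 and 0.3, and the standard relativistic velocity transformation for a frame S' moving with speed $V$ along $x_1$ (which is what (15.26) and (15.27) are taken to be here). The function names are hypothetical.

```python
import math

def transform_velocity(v, V, c=1.0):
    """Velocity (v1, v2) in S -> (v1', v2') in S', which moves at speed V along x1."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    denom = 1.0 - v[0] * V / c**2
    return ((v[0] - V) / denom, v[1] / (gamma * denom))

def total_classical_momentum(m, velocities):
    """Sum of m*v over all particles, component by component."""
    return (sum(m * v[0] for v in velocities), sum(m * v[1] for v in velocities))

m = 1.0
v_in = [(0.6, 0.3), (-0.6, -0.3)]    # particles a and b before the collision
v_fin = [(0.6, -0.3), (-0.6, 0.3)]   # x2 components reversed afterwards
V = 0.6                               # S' moves with a's x1 velocity component

p_in_S = total_classical_momentum(m, v_in)
p_fin_S = total_classical_momentum(m, v_fin)
p_in_Sp = total_classical_momentum(m, [transform_velocity(v, V) for v in v_in])
p_fin_Sp = total_classical_momentum(m, [transform_velocity(v, V) for v in v_fin])

print("S :", p_in_S, p_fin_S)
print("S':", p_in_Sp, p_fin_Sp)
```

In S both totals vanish, but in S' the $x_2$ component of $\sum m\mathbf{v}'$ is nonzero and reverses sign in the collision, so the classical momentum is not conserved there.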

The problem with the classical definition of momentum, p = mv, derives from the 
awkward transformation of the three-velocity v, and this suggests a more promising 
approach to our problem. Instead of using the three-velocity v, suppose we used the 
four-velocity u to define the four-momentum of any object of mass m as 


$$p = mu = (\gamma m\mathbf{v}, \gamma mc)$$ [definition of four-momentum]. (15.68)


[The last expression follows from (15.67).] Since $m$ is a four-scalar and $u$ is a
four-vector, this defines $p$ as a four-vector. If, in the usual way, we write

$$p = (\mathbf{p}, p_4)$$ (15.69)

then this defines the three-momentum, $\mathbf{p}$, as the spatial part of the four-vector $p$ or,
comparing with (15.68),


$$\mathbf{p} = m\mathbf{u} = \gamma m\mathbf{v}$$ [definition of three-momentum]. (15.70)


If our object is traveling slowly ($|\mathbf{v}| \ll c$), then $\gamma \approx 1$ and our new definition of $\mathbf{p}$
agrees with the classical one, $\mathbf{p} = m\mathbf{v}$. In general, however, the two definitions differ
by a factor of $\gamma$, and it is the new definition (15.70) that proves useful.


in S, their $y$ components wind up with different magnitudes in S'. Thus $\sum mv'_y$ is nonzero and
actually changes sign when the velocities reverse in the collision.
What about conservation of momentum? Since we are now saddled with a four-dimensional
momentum vector, it seems clear that conservation of momentum, if it
is to be true at all, should be a four-dimensional law, which we could write as

$$\sum p_{\text{fin}} = \sum p_{\text{in}}.$$ (15.71)

This is really four equations. The first three would be the law of conservation of
the newly defined three-momentum, and the fourth would be the conservation of
something else, namely the fourth component $p_4$. Obviously, we need to find
out quickly what this fourth component is, but I shall give this important question
a section of its own. Briefly, though, we shall find that the fourth component of
the new four-momentum is the energy (actually $E/c$), so that the law (15.71) is a
wonderfully compact combination of the old laws of momentum and energy conservation.

Here, what I want to emphasize is this: Since the four-momentum p is a four-vector, 
the same is true of both sides of (15.71). Therefore, if (15.71) is true in one frame S, 
it is automatically true in all frames; that is, our proposed law of conservation of four- 
momentum is compatible with the postulates of relativity. Whether the law is actually 
true must, of course, be decided by experiment. As you have no doubt guessed, the 
verdict is clear: Countless experiments have shown that the total four-momentum of 
an isolated system is constant. 
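For the symmetric collision of Figure 15.11, the compatibility of four-momentum conservation with a change of frame can be seen numerically. This Python sketch is an illustration only: it assumes units with $c = 1$, unit masses, sample velocity components 0.6 and 0.3, and the standard relativistic velocity transformation; the $x_3$ component is suppressed and all function names are hypothetical.

```python
import math

def gamma(speed):
    """Gamma factor in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - speed**2)

def four_momentum(m, v):
    """(gamma m v1, gamma m v2, gamma m): spatial parts and fourth component, c = 1."""
    g = gamma(math.hypot(v[0], v[1]))
    return (g * m * v[0], g * m * v[1], g * m)

def transform_velocity(v, V):
    """Velocity (v1, v2) in S -> velocity in S', which moves at speed V along x1."""
    g = gamma(V)
    denom = 1.0 - v[0] * V
    return ((v[0] - V) / denom, v[1] / (g * denom))

def total(momenta):
    """Component-wise sum of a list of four-momenta."""
    return tuple(sum(p[i] for p in momenta) for i in range(3))

m, V = 1.0, 0.6
v_in = [(0.6, 0.3), (-0.6, -0.3)]    # particles a and b before the collision, in S
v_fin = [(0.6, -0.3), (-0.6, 0.3)]   # x2 components reversed afterwards

P_in = total([four_momentum(m, transform_velocity(v, V)) for v in v_in])
P_fin = total([four_momentum(m, transform_velocity(v, V)) for v in v_fin])

print(P_in)
print(P_fin)
```

Unlike the classical sum $\sum m\mathbf{v}'$, the total of $\gamma m\mathbf{v}'$ and of $\gamma m$ comes out the same before and after the collision in S', just as it does in S.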


Variable Mass 

Some physicists like to rewrite the definition (15.70) of the relativistic three- 
momentum by introducing a variable mass 


$$m_{\text{var}} = \gamma(v)\,m.$$ (15.72)

With this definition the three-momentum becomes

$$\mathbf{p} = m_{\text{var}}\mathbf{v}.$$ (15.73)

This has the advantage that it makes the relativistic momentum look like its
nonrelativistic counterpart, $\mathbf{p} = m\mathbf{v}$. Nevertheless, it has important disadvantages, which have
led the majority of practising physicists to avoid the use of the variable mass. First, it
is not necessarily a good idea to make a new definition look like its older counterpart
when there are, in fact, important differences. Second, the introduction of the variable
mass fails to achieve a complete parallel with classical mechanics. For example, it is
not true that the relativistic kinetic energy (which we shall define in the next section)
is equal to $\frac{1}{2}m_{\text{var}}v^2$, nor is it true (in general) that $\mathbf{F} = m_{\text{var}}\mathbf{a}$. (See Problems 15.59 and
15.79.) Third, unlike the invariant mass, the variable mass is not a Lorentz scalar. For all of
these reasons, I shall not use the variable mass here.
15.13 Energy, the Fourth Component of Momentum


The conservation of four-momentum means that, for any isolated system, there are 
four conserved quantities. The first three comprise the newly defined three-momentum 
p. But what about the fourth component? We are going to find that, for any freely 
moving object, the fourth component of the four-momentum defined by (15.68) is 
the energy divided by c. This is such an important result that I’ll state it as a formal 
definition and then discuss its justification and consequences. Accordingly, we make 
the definition 


Definition of Relativistic Energy 

The energy $E$ of a freely moving object with four-momentum $p = (\mathbf{p}, p_4)$ is

$$E = p_4c = \gamma mc^2,$$ (15.74)


where the second expression follows from (15.68), which implies that $p_4 = \gamma mc$.
With this definition, we can rewrite the four-momentum $p$ as

$$p = (\mathbf{p}, E/c),$$ (15.75)

which explains why the four-momentum $p$ is also called the momentum-energy
four-vector.

In partial justification of the definition (15.74), notice first that (15.74) does at least
have the dimensions of energy, namely [mass $\times$ speed$^2$]. Next, let us look at $E$ for a
nonrelativistic object to see if it looks like the nonrelativistic energy. With $v \ll c$, we
can expand $\gamma$ using the binomial series to give

$$\gamma = [1 - (v/c)^2]^{-1/2} = 1 + \tfrac{1}{2}(v/c)^2 + \cdots,$$ (15.76)

so that (15.74) becomes

$$E \approx mc^2 + \tfrac{1}{2}mv^2$$ (15.77)

provided $v \ll c$. In nonrelativistic mechanics, mass was believed to be absolutely
conserved, so the term $mc^2$ would have been considered to be constant. Since the zero
of energy was arbitrary, a classical physicist would have interpreted (15.77) to say that
the newly defined $E$ is just the classical kinetic energy plus an irrelevant constant.
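How good the approximation (15.77) is can be judged by comparing it with the exact $E = \gamma mc^2$ at a few speeds. The Python sketch below is illustrative only; the function names, the 1 kg mass, and the sample speeds are assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def energy_exact(m, v):
    """Relativistic energy E = gamma m c^2 of (15.74)."""
    return m * C**2 / math.sqrt(1.0 - (v / C) ** 2)

def energy_approx(m, v):
    """Nonrelativistic approximation (15.77): m c^2 + (1/2) m v^2."""
    return m * C**2 + 0.5 * m * v**2

m = 1.0  # kg
for v in (3.0e5, 3.0e7, 1.5e8):   # roughly 0.001c, 0.1c, 0.5c
    exact = energy_exact(m, v)
    rel_err = (exact - energy_approx(m, v)) / exact
    print(f"v/c = {v / C:.3f}   relative error of (15.77) = {rel_err:.2e}")
```

The leading neglected term in the binomial series is $\tfrac{3}{8}(v/c)^4$, so the relative error is negligible at everyday speeds and grows to a few percent only when $v$ is a sizable fraction of $c$.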

To illustrate the result (15.77), consider for a moment an elastic collision. Suppose,
for example, that two atoms $a$ and $b$, with masses $m_a^{\text{in}}$ and $m_b^{\text{in}}$ and nonrelativistic
speeds $v_a^{\text{in}}$ and $v_b^{\text{in}}$, collide and re-emerge with final masses $m_a^{\text{fin}}$ and $m_b^{\text{fin}}$ and speeds
$v_a^{\text{fin}}$ and $v_b^{\text{fin}}$. (Of course, in classical mechanics, the initial and final masses would be
the same, but we’ll use different labels to avoid prejudging this question.) In any
two-body collision, conservation of the newly defined relativistic energy (15.74) implies
that

$$E_a^{\text{in}} + E_b^{\text{in}} = E_a^{\text{fin}} + E_b^{\text{fin}}$$
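or, if the collision is nonrelativistic,

$$\left[m_a^{\text{in}}c^2 + \tfrac{1}{2}m_a^{\text{in}}(v_a^{\text{in}})^2\right] + \left[m_b^{\text{in}}c^2 + \tfrac{1}{2}m_b^{\text{in}}(v_b^{\text{in}})^2\right]
= \left[m_a^{\text{fin}}c^2 + \tfrac{1}{2}m_a^{\text{fin}}(v_a^{\text{fin}})^2\right] + \left[m_b^{\text{fin}}c^2 + \tfrac{1}{2}m_b^{\text{fin}}(v_b^{\text{fin}})^2\right].$$

Regrouping terms, we can write this as

$$M^{\text{in}}c^2 + T^{\text{in}} = M^{\text{fin}}c^2 + T^{\text{fin}}$$ (15.78)

where $M^{\text{in}}$ denotes the initial total mass, $T^{\text{in}}$ the initial total kinetic energy, and so
on. 22 Now, according to classical ideas, mass is conserved, so $M^{\text{in}} = M^{\text{fin}}$. Therefore,
the two mass terms in (15.78) would cancel, leaving just

$$T^{\text{in}} = T^{\text{fin}}.$$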

That is, the total kinetic energy would be conserved — which is precisely what we 
know to be true in an elastic collision. Thus, in the context of nonrelativistic elastic 
collisions, conservation of the newly defined relativistic energy (15.74) coincides with 
the familiar conservation of classical energy. This is perhaps the simplest and strongest 
single argument for regarding the definition (15.74) as an appropriate generalization 
of the classical notion of energy (together, of course, with the experimental fact that 
energy defined this way is conserved). 

The argument of the last paragraph gives us a reassuringly familiar result in the 
case of elastic collisions. However, it is going to lead us to the first big surprise of our 
relativistic mechanics. We know that, even in the context of nonrelativistic mechanics, 
there are inelastic processes, in which total kinetic energy is not conserved. For 
example, in the case of our two atoms, the collision could disturb the internal motion 
of the atomic electrons, changing the internal energy of one (or both) of the atoms. In 
this case we know that the atoms would emerge with changed total kinetic energy, so
that $T^{\text{fin}} \neq T^{\text{in}}$. (That such processes can occur is a well-established experimental fact;
the Franck-Hertz experiment, in which electrons collided inelastically with mercury
atoms, was a famous example.) Now, the argument leading to (15.78) depends only on
the (true) assumption that relativistic energy is conserved, so (15.78) must apply to any
possible nonrelativistic collision. In particular, in an inelastic collision with $T^{\text{fin}} \neq T^{\text{in}}$,
(15.78) implies that the total mass of the two atoms has to change, $M^{\text{fin}} \neq M^{\text{in}}$. If
we imagine an inelastic collision in which one of the atoms has its internal energy
changed while the other is completely unchanged, then for the first atom we can say
this: If the atom gains internal energy (is “excited”), then $T^{\text{fin}} < T^{\text{in}}$ and hence, from
(15.78), $M^{\text{fin}} > M^{\text{in}}$; that is, when an atom gains internal energy it has to gain mass.
Conversely, if the atom loses internal energy, it has to lose mass.

If relativistic energy is conserved (as it is), then it follows logically that mass cannot 
be conserved. The first question that must be addressed is why this nonconservation 


22 By “kinetic energy” I mean here the kinetic energy $\frac{1}{2}mv^2$ of the translational motion of either
atom as a whole. Of course, an atom may also have kinetic energy of its electrons as they orbit the 
nucleus, but we shall include that as part of the atom’s internal energy. 
of mass had not been discovered much sooner. To answer this, notice that (15.78) can
be written as

$$\Delta M\,c^2 = -\Delta T.$$ (15.79)

The brief answer to our question is this: By everyday standards, $c^2$ is an exceedingly
large quantity, so that, even if $\Delta T$ is fairly large, $\Delta M = -\Delta T/c^2$ is still very small
(in most cases, unobservably so), as the following example illustrates.


EXAMPLE 15.7 Mass Change in the Franck-Hertz Experiment

In their famous experiment in 1914, James Franck and Gustav Hertz fired 
electrons through a container of mercury vapor. The mercury atom has a state 
whose internal energy is 4.9 eV (1 eV = $1.6 \times 10^{-19}$ J) higher than the atom’s
normal “ground state.” In some of the collisions between the electrons and the 
mercury atoms, a mercury atom was excited into this state, with the result that the 
final kinetic energy of the emerging particles was 4.9 eV less than the initial; that 
is, AT = -4.9 eV. By how much was the mass of the mercury atom increased 
by the collision? 

According to (15.79), the increase of mass is 

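$$\Delta M = \frac{-\Delta T}{c^2} = \frac{4.9 \times (1.6 \times 10^{-19}\ \text{J})}{(3 \times 10^{8}\ \text{m/s})^2} = 8.7 \times 10^{-36}\ \text{kg}.$$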


(Check the conversion and the arithmetic yourself.) The electron emerges with 
its mass unchanged (it is still just an electron), so all of this mass increase goes 
to the mercury atom (which is now an excited mercury atom). The increase is 
fantastically small by everyday standards, but the real question is this: How big 
is the mass increase compared to the original mass of the mercury atom, which 
is 200.6 atomic mass units or $3.3 \times 10^{-25}$ kg? This fractional change in the mass
of the atom is

$$\frac{\Delta m}{m} = \frac{8.7 \times 10^{-36}}{3.3 \times 10^{-25}} = 2.6 \times 10^{-11}.$$


This fractional change is far too small to be detected by any direct measurement 
of masses. 
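The conversion and arithmetic of this example can be checked with a few lines of Python. This is a sketch using the same rounded constants as the example; the atomic mass unit value $1.66 \times 10^{-27}$ kg is an assumed standard rounded value that is not quoted in the text.

```python
EV = 1.6e-19     # joules per electron volt (rounded, as in the example)
C = 3.0e8        # speed of light in m/s (rounded)
U = 1.66e-27     # kilograms per atomic mass unit (assumed rounded value)

delta_T = -4.9 * EV             # change in total kinetic energy, in joules
delta_M = -delta_T / C**2       # mass gained by the mercury atom, from (15.79)
mercury_mass = 200.6 * U        # mass of a mercury atom in kg
fraction = delta_M / mercury_mass

print(f"delta M  = {delta_M:.2e} kg")
print(f"fraction = {fraction:.2e}")
```

The two printed values reproduce the mass increase and the fractional change found above to two significant figures.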


The energy released in typical chemical reactions is also of order a few eV per 
atom. For example, the burning of hydrogen in oxygen can be thought of as an inelastic 
collision 


$$\text{H}_2 + \text{H}_2 + \text{O}_2 \to \text{H}_2\text{O} + \text{H}_2\text{O}$$ (15.80)

in which the total kinetic energy of the final two molecules is about 5 eV more than that
of the original three. 23 Conservation of relativistic energy requires that the outgoing


23 It is not a coincidence that the energy release (or loss) in chemical reactions is about the same 
as that in the Franck-Hertz experiment of Example 15.7. In both cases the change originates in 
water molecules have less total mass than the initial hydrogens and oxygen, but, again,
the difference is far too small to be detected by direct mass measurements. 

In nuclear reactions, the kinetic energy released can be much greater. For example, 
in the neutron-induced fission 

$$n + {}^{235}\text{U} \to {}^{90}\text{Kr} + {}^{143}\text{Ba} + n + n + n$$

the kinetic energy increases by about 200 MeV, and the fractional loss of mass is about 
1 part in 1000 — still not very large, but large enough to measure directly for many 
nuclear reactions. As we shall see later there are processes in which the mass change 
is even larger, but the evidence from nuclear physics is already sufficient to confirm 
the relativistic prediction (15.79) beyond reasonable doubt. 
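The figure of about 1 part in 1000 can be estimated roughly as follows. This Python sketch rests on two assumptions not stated in the text: the rest energy of one atomic mass unit is taken as the standard value of about 931.5 MeV, and the initial mass (neutron plus uranium-235) is approximated as 236 atomic mass units.

```python
U_MEV = 931.5            # rest energy of one atomic mass unit, in MeV (assumed standard value)
released = 200.0         # kinetic energy released in the fission, in MeV (from the text)
initial_mass_u = 236.0   # approximate initial mass: neutron + uranium-235, in u

initial_rest_energy = initial_mass_u * U_MEV       # total initial mass energy in MeV
fractional_loss = released / initial_rest_energy   # |delta M| / M from (15.79)

print(f"fractional mass loss = {fractional_loss:.1e}")
```

The result is a little under $10^{-3}$, some seven or eight orders of magnitude larger than the fractional change in the Franck-Hertz example.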


Mass Energy 

We have seen that the term $mc^2$ in the result (15.77), $E \approx mc^2 + \frac{1}{2}mv^2$, is certainly
not the “irrelevant constant” that a classical physicist would have taken it to be. In
fact, in relativity, unlike classical mechanics, the energy does not contain an arbitrary
constant. This is because we certainly want the four-momentum $p = (\mathbf{p}, E/c)$ to be a
four-vector, and the addition of a constant to $E$ would destroy this desirable property.
(See Problem 15.66.) Bearing this in mind, let us look again at the relativistic definition
of the energy of an object, $E = \gamma mc^2$. Even if the object is at rest, with $\gamma = 1$, the
object still has some energy, given by $E = mc^2$ (perhaps the most famous equation in
all of physics). This energy is naturally called the rest energy of the object or, since
it is associated with the mass $m$, the mass energy.

The concept of mass energy lets us interpret the inelastic processes discussed above 
as processes in which some mass energy is converted into kinetic energy or vice versa. 
In the processes of atomic and nuclear physics, this conversion typically involves only 
a tiny fraction of the total mass energy, but there are processes in which there is 100% 
conversion. For example, in a collision between an electron ($e^-$) and its “antiparticle”
the positron ($e^+$), both particles can be annihilated,

$$e^- + e^+ \to \text{radiation}$$

with 100% of their mass energy becoming the energy of electromagnetic radiation. 

When an object is moving, $\gamma > 1$ and its energy $E = \gamma mc^2$ is greater than its rest
energy $mc^2$. This suggests that we define a quantity $T$ by the equation

$$E = mc^2 + T.$$ (15.81)

This $T$ is the additional energy that an object has by virtue of its motion and is naturally
called the kinetic energy,

$$T = E - mc^2 = (\gamma - 1)mc^2.$$ (15.82)


differences in the energy levels of the electrons in the atoms or molecules, and these differences are 
almost always of order a few eV. 





When the object is moving slowly, we have seen that $T \approx \frac{1}{2}mv^2$ [this follows from
(15.77)], but in general the nonrelativistic result is incorrect and we must use the 
relativistic definition (15.82). 
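To see how the two expressions for the kinetic energy part company, the Python sketch below evaluates both for an electron at a low and a high speed. The function names and the sample values of $\beta = v/c$ are illustrative assumptions; the rest energy 0.511 MeV is the electron value.

```python
import math

def kinetic_relativistic(mc2, beta):
    """T = (gamma - 1) m c^2 of (15.82), with the rest energy mc2 in any energy unit."""
    return (1.0 / math.sqrt(1.0 - beta**2) - 1.0) * mc2

def kinetic_classical(mc2, beta):
    """(1/2) m v^2, written as (1/2) (m c^2) beta^2."""
    return 0.5 * mc2 * beta**2

mc2 = 0.511  # electron rest energy in MeV
slow = (kinetic_relativistic(mc2, 0.01), kinetic_classical(mc2, 0.01))
fast = (kinetic_relativistic(mc2, 0.9), kinetic_classical(mc2, 0.9))

print("beta = 0.01:", slow)
print("beta = 0.90:", fast)
```

At $\beta = 0.01$ the two values agree to better than one part in ten thousand, while at $\beta = 0.9$ the relativistic kinetic energy is more than three times the classical expression.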


Three Useful Relations 

There are three useful relations among the parameters m, v, p, and E that characterize 
an object’s motion. First, since $p = \gamma m(\mathbf{v}, c)$ and also $p = (\mathbf{p}, E/c)$, we see at once
that

$$\boldsymbol{\beta} \equiv \frac{\mathbf{v}}{c} = \frac{\mathbf{p}c}{E}.$$ (15.83)


This relation lets you find the velocity of an object if you know its three-momentum 
p and energy E. 

Consider next the invariant “length squared” $p^2 = p \cdot p$. In the object’s rest frame
$p$ has the form $p = (0, 0, 0, mc)$, so that $p \cdot p = -(mc)^2$. Since both sides of this
equation are invariant, it immediately follows that the same relation holds in any
frame: For any object with four-momentum $p$ and mass $m$,

$$p \cdot p = -(mc)^2$$ (15.84)

in any inertial frame. This relation is well worth memorizing and can greatly simplify
several calculations, as we shall see.

Finally, it is sometimes useful to rewrite the result (15.84) in terms of the
three-momentum and the energy. Since $p = (\mathbf{p}, E/c)$, (15.84) becomes $\mathbf{p}^2 - (E/c)^2 =
-(mc)^2$ or

$$E^2 = (mc^2)^2 + (\mathbf{p}c)^2.$$ (15.85)


This shows that the three quantities $E$, $mc^2$, and $|\mathbf{p}|c$ are related like the sides of a
right triangle, with $E$ as the hypotenuse, as indicated in Figure 15.12. At this stage
there is no deep geometrical significance to this statement, but it does give an easy
way to remember and visualize the relation (15.85). If the speed is much less than
$c$, then $\gamma \approx 1$, so $E \approx mc^2$; in this case, the hypotenuse and base of the triangle are
nearly equal, so $T \ll mc^2$ and the triangle is very low (height $\ll$ base). On the other
hand, if $v$ is very close to $c$, then $\gamma \gg 1$, so $E \gg mc^2$; in this case, $T \gg mc^2$, so the
energy is mostly kinetic, and the triangle is very tall (height $\gg$ base) with $E \approx |\mathbf{p}|c$.
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Figure 15.12 The three parameters E, me 2 , and |p|c are related like 
the sides of a right triangle with E as the hypotenuse. 


example 15.8 Energy and Momentum of an Electron

The rest energy of an electron is about 0.5 MeV (actually 0.511, but for many
purposes 0.5 is good enough). What is the electron’s mass in the SI unit (the
kilogram) and in MeV/$c^2$? If its kinetic energy is $T = 0.8$ MeV, what is its total
energy $E$ and what is the magnitude $|\mathbf{p}|$ of its three-momentum in MeV/$c$? What
is its speed?

The given rest energy tells us that

$mc^2 = 0.5$ MeV. (15.86)

Solving for $m$, converting eV to joules, and putting in the value of $c$, we
find (as you can check) that $m \approx 9 \times 10^{-31}$ kg (more precisely, $9.11 \times 10^{-31}$ kg).
Straightforward as this calculation is in kilograms, it is even easier, and often
much more convenient, in MeV/$c^2$. We simply divide both sides of (15.86) by
$c^2$ and we have the answer directly:

$m = 0.5$ MeV/$c^2$.

Evidently, the mass in MeV/$c^2$ is numerically the same as $mc^2$ in MeV. This
is so simple that it takes a little getting used to. If you’ve never used the unit
MeV/$c^2$ before, just keep reminding yourself that the statement $m = 0.5$ MeV/$c^2$
is precisely equivalent to the statement that $mc^2 = 0.5$ MeV. The reason this is
such a convenient way to specify masses is that our real concern is often not
with the mass itself, but rather with the corresponding energy $mc^2$, and the most
convenient unit for the latter is often the MeV.

If $T = 0.8$ MeV, then clearly $E = T + mc^2 = 1.3$ MeV, and by the “useful
relation” (15.85)

$|\mathbf{p}|c = \sqrt{E^2 - (mc^2)^2} = \sqrt{1.3^2 - 0.5^2}$ MeV $= 1.2$ MeV.

Once again, we could get an answer in SI units by making the necessary
conversions, but a simpler, and often more convenient, course is just to divide
both sides by $c$ to give

$|\mathbf{p}| = 1.2$ MeV/$c$.

Finally, according to the useful relation (15.83), the electron’s dimensionless
speed $\beta = v/c$ is

$\beta = \dfrac{|\mathbf{p}|c}{E} = \dfrac{1.2\ \text{MeV}}{1.3\ \text{MeV}} = 0.92;$

that is, $v = 0.92c$.

Notice how nicely the factors of $c$ cancel if we measure masses in MeV/$c^2$
and momenta in MeV/$c$ when using the relations (15.83) and (15.85). This is
because $m$ and $\mathbf{p}$ enter these relations only through the combinations $mc^2$ and
$\mathbf{p}c$. For some practice at using these relations and the new units see Problems
15.61 to 15.63.
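
The arithmetic of this example is easy to check numerically. The following is only a minimal sketch in Python, working throughout in MeV so that the factors of $c$ never appear; the function name `energy_momentum_speed` is my own choice for illustration and is not part of the text.

```python
import math

def energy_momentum_speed(rest_energy_mev, kinetic_energy_mev):
    """Return (E, |p|c, beta) for a particle, all energies in MeV.

    Uses E = mc^2 + T (15.81), (pc)^2 = E^2 - (mc^2)^2 (15.85),
    and beta = pc/E (15.83).
    """
    total_energy = rest_energy_mev + kinetic_energy_mev
    pc = math.sqrt(total_energy**2 - rest_energy_mev**2)
    beta = pc / total_energy
    return total_energy, pc, beta

# Electron of the example: mc^2 = 0.5 MeV, T = 0.8 MeV
E, pc, beta = energy_momentum_speed(0.5, 0.8)
print(f"E = {E:.2f} MeV, |p|c = {pc:.2f} MeV, beta = {beta:.2f}")
```

Because the inputs are $mc^2$ and $T$ in MeV, the returned $|\mathbf{p}|c$ is numerically the momentum in MeV/$c$.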


15.14 Collisions 


The laws of conservation of energy and momentum play a key role in the analysis of 
collisions. In this section, I shall illustrate this claim with three examples. 


example 15.9 Collision of Two Lumps of Putty 

A relativistic ball of putty, with mass $m_a$, energy $E_a$, and velocity $\mathbf{v}_a$, collides
with a stationary ball of mass $m_b$, as shown in Figure 15.13. If the two balls fuse
to form a single lump, what is the lump’s mass $m$ and with what velocity $\mathbf{v}$ does
it move off?

To find the final mass, we have only to recall that the invariant “length
squared” of the four-momentum of any object is $-m^2c^2$. If we denote the final
four-momentum by $p_{\text{fin}}$, then

$(p_{\text{fin}})^2 = -m^2c^2.$ (15.87)

By conservation of momentum-energy, $p_{\text{fin}} = p_{\text{in}}$, where $p_{\text{in}}$ is the total initial
momentum; that is, $p_{\text{in}} = p_a + p_b$, from which

$(p_{\text{in}})^2 = (p_a + p_b)^2 = p_a^2 + p_b^2 + 2\,p_a \cdot p_b = -m_a^2c^2 - m_b^2c^2 - 2E_a m_b$ (15.88)


where the last term comes about because $p_b = (0, 0, 0, m_b c)$. Comparing
(15.87) and (15.88), we find that the mass of the final lump is

$m = \sqrt{m_a^2 + m_b^2 + 2E_a m_b/c^2}.$

Notice that if the original motion was nonrelativistic, then $E_a \approx m_a c^2$, and we
recover the nonrelativistic result $m = m_a + m_b$; but, in general, $m > m_a + m_b$.

Figure 15.13 Two balls of putty collide and form a single lump.

According to the useful relation (15.83), the final velocity is $\mathbf{v} = \mathbf{p}_{\text{fin}}c^2/E_{\text{fin}}$.
By conservation of four-momentum, we can replace the components of $p_{\text{fin}}$ by
those of $p_{\text{in}}$ to give

$\mathbf{v} = \dfrac{\mathbf{p}_a c^2}{E_a + m_b c^2} = \dfrac{\gamma_a m_a \mathbf{v}_a}{\gamma_a m_a + m_b}$ (15.89)

where $\gamma_a$ denotes the $\gamma$ factor for the incoming ball $a$. Notice that if $v_a \ll c$, this
reduces to the familiar nonrelativistic result $\mathbf{v} = m_a\mathbf{v}_a/(m_a + m_b)$.
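
As a numerical illustration of this example, here is a short Python sketch that evaluates the final mass, obtained by comparing (15.87) with (15.88), and the final speed (15.89). It assumes one-dimensional motion and units in which $c = 1$; the function name `fuse` and the sample numbers are mine, chosen only for illustration.

```python
import math

def fuse(m_a, m_b, beta_a):
    """Mass and speed of the lump formed when a (speed beta_a) hits b at rest.

    Units with c = 1. The mass follows from m^2 = m_a^2 + m_b^2 + 2 E_a m_b,
    and the speed from (15.89).
    """
    gamma_a = 1.0 / math.sqrt(1.0 - beta_a**2)
    energy_a = gamma_a * m_a                      # E_a = gamma_a m_a c^2 with c = 1
    mass = math.sqrt(m_a**2 + m_b**2 + 2.0 * energy_a * m_b)
    beta = gamma_a * m_a * beta_a / (gamma_a * m_a + m_b)
    return mass, beta

mass, beta = fuse(1.0, 1.0, 0.6)
print(mass, beta)   # the mass exceeds m_a + m_b = 2
```

With these sample values the final mass comes out greater than $m_a + m_b$, in line with the remark above that kinetic energy of the incoming ball has been converted into rest mass.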


The CM Frame 

In nonrelativistic mechanics, we have seen that a very useful concept is that of the 
CM or center-of-mass frame — the frame in which the center of mass of a system 
is at rest. Alternatively, this frame can be characterized as the frame in which the
total momentum is zero, $\mathbf{P} = \sum \mathbf{p} = 0$. (So you can think of “CM” as standing for
“center of momentum” if you like.) This alternative definition carries over directly
to relativistic mechanics.²⁴ We have seen that the four-momentum $p$ of any material
particle is forward time-like (lies inside the forward light cone).²⁵ Now, it is a simple
exercise to prove (Problem 15.69) that the sum of any number of forward time-like
vectors is itself forward time-like. Therefore, the total momentum $P = \sum p$ of any
collection of particles is also time-like, and this guarantees that there exists a frame
in which $P$ has the form $P = (0, 0, 0, P_4)$. Naturally, we define this frame, in which
the total three-momentum $\mathbf{P} = 0$, to be the CM frame of the system.

It often happens that a collision problem is very easy to solve in the CM frame.
Thus, if we need to solve the same problem in some other frame S, the simplest
procedure is often to transform from S to the CM frame, solve the problem there, and
then transform back to S, as the following two examples illustrate.


24 Oddly enough, the notion of center of mass does not carry over satisfactorily into relativity. 
Thus, it is better to think of the CM frame as the center-of-momentum frame. 

25 So far, we are considering only material particles, that is, particles with mass m > 0. We shall 
soon discuss the case of massless particles, for which p lies on the light cone. Fortunately (with one 
small exception) the same results apply even when some of the particles are massless. See Problems 
15.88 and 15.89. 
example 15.10 An Elastic Head-On Collision

Consider an elastic head-on collision between a projectile, with mass $m_a$ and
velocity $\mathbf{v}_a$, and a stationary target of mass $m_b$, as shown in Figure 15.14(a).
(That the collision is head-on means that the two particles emerge from the
collision both moving along the line of the incident velocity $\mathbf{v}_a$, as shown.) What
is the final velocity $\mathbf{v}_b$ of the target particle $b$?

Let us denote by S the lab frame, in which the experiment actually occurs
(with $b$ initially at rest), and take the direction of the incident velocity to be
the $x$ axis. To solve the problem directly in frame S, we would write down the
equations of conservation of energy and momentum and solve for the requested
final velocity. Unfortunately, the equations are extremely messy, and a much
simpler course is to transform to the CM frame S'. In the CM frame the two
incoming three-momenta are equal and opposite, and it is easy to see (Problem
15.68) that the collision simply reverses them both. Thus our procedure will be
this: (1) Transform $p_b$ to the CM frame S'. (2) Reverse its spatial part $\mathbf{p}_b$. And (3)
transform back to S and calculate the velocity. Before we do this, we need to find
the velocity of the CM frame S' relative to S. Since the total four-momentum is

$P = (\mathbf{p}_a, E_a/c + m_b c),$

the required (dimensionless) velocity $\boldsymbol{\beta}$ to transform to the CM frame is

$\boldsymbol{\beta} = \dfrac{\mathbf{p}_a c}{E_a + m_b c^2}.$ (15.90)


(In the nonrelativistic limit, with $\mathbf{p}_a \approx m_a\mathbf{v}_a$ and $E_a \approx m_a c^2$, this corresponds
to a velocity $\mathbf{v} = m_a\mathbf{v}_a/(m_a + m_b)$, which is the velocity of the center of mass,
as expected.)

We can now follow our three steps to solve the problem. In the lab frame,
the initial four-momentum of the target $b$ is

$p_b^{\text{in}} = (0, 0, 0, m_b c)$ [lab frame, initial]. (15.91)


Figure 15.14 (a) An elastic head-on collision as seen in the lab frame S, where
the target $b$ is initially at rest. (b) In the CM frame S', all of the three-momenta
have the same magnitude. The only effect of the collision is to reverse the
three-momentum of each particle. (The arrows represent momenta.)

Applying the standard Lorentz boost, with velocity (15.90), we find the corresponding
CM momentum

$p_b'^{\,\text{in}} = \gamma m_b c\,(-\beta, 0, 0, 1)$ [CM frame, initial]. (15.92)

In the CM frame the collision simply reverses the spatial components of this
momentum. Thus the corresponding final momentum is

$p_b'^{\,\text{fin}} = \gamma m_b c\,(\beta, 0, 0, 1)$ [CM frame, final]. (15.93)

Finally, transforming back to the lab frame, we find

$p_b^{\text{fin}} = \gamma^2 m_b c\,(2\beta, 0, 0, 1 + \beta^2)$ [lab frame, final]. (15.94)

The corresponding dimensionless velocity is just the ratio $2\beta/(1 + \beta^2)$, so the
actual final velocity of the target is

$v_b = \dfrac{2\beta}{1 + \beta^2}\,c$ (15.95)

with $\beta$ given by (15.90).

The answer (15.95), although easily found, is not especially illuminating
in general. In the special case that the two masses are equal, it is easy to show
(Problem 15.73) that (15.95) reduces to $\mathbf{v}_b = \mathbf{v}_a$; that is, the target $b$ emerges with
precisely the velocity of the incoming projectile $a$, and the projectile therefore
comes to a dead stop. This behavior is well known to students of nonrelativistic
mechanics (and to billiards players) and shows that, in the case that $m_a = m_b$,
the relativistic result (15.95) agrees exactly with the familiar nonrelativistic one.

Whether or not the two masses are equal, it is easily shown that in the limit
$v_a \ll c$, the relativistic result (15.95) approaches the corresponding nonrelativistic
one. (See Problem 15.73.)
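
Both of these limiting cases can be checked numerically before attempting the algebra of Problem 15.73. The sketch below is only an illustration in Python, assuming one-dimensional motion and units with $c = 1$; the function name `target_final_beta` is hypothetical.

```python
import math

def target_final_beta(m_a, m_b, beta_a):
    """Final speed (in units of c) of a target b struck head-on by projectile a.

    Step 1: speed of the CM frame from (15.90), with c = 1.
    Step 2: final target speed from (15.95).
    """
    gamma_a = 1.0 / math.sqrt(1.0 - beta_a**2)
    p_a = gamma_a * m_a * beta_a        # projectile three-momentum
    energy_a = gamma_a * m_a            # projectile energy
    beta_cm = p_a / (energy_a + m_b)    # (15.90)
    return 2.0 * beta_cm / (1.0 + beta_cm**2)   # (15.95)

# Equal masses: the target should leave with the projectile's speed.
print(target_final_beta(1.0, 1.0, 0.8))

# Slow projectile: compare with the nonrelativistic 2 m_a v_a / (m_a + m_b).
print(target_final_beta(1.0, 3.0, 1e-4), 2 * 1.0 * 1e-4 / (1.0 + 3.0))
```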


Threshold Energies 

Most of the elementary particles that have been discovered in the last seventy years or
so were found when they were produced in collisions of other particles. For example,
the negative pion, $\pi^-$, can be created in a collision of a proton and a neutron,

$p + n \to p + p + \pi^-.$

Similarly, the first antiproton to be observed was produced by a proton-proton collision
in the reaction

$p + p \to p + p + p + \bar{p}.$ (15.96)

(The negatively charged antiproton $\bar{p}$ is the “antiparticle” of the proton, with the same
mass but opposite charge.) A quantity of great concern to any experimenter hoping to
observe this kind of reaction is the threshold energy, defined as the minimum energy
of the initial particles for which the reaction can occur.

Let us consider a reaction of the form

$a + b \to d + \cdots + g.$

In the traditional collision experiment, one of the original particles (let’s say $b$) was
usually at rest, defining what we have been calling the lab frame. Thus, the concern
was to know the threshold energy for a reaction of this kind in the lab frame. At first
glance this would seem a simple matter to calculate. The minimum possible energy
of the final particles is just their total rest energy $\sum m_{\text{fin}} c^2$, the energy they have
when all at rest. Surely, then, the threshold energy for the reaction is just $\sum m_{\text{fin}} c^2$.
Unfortunately this plausible argument is wrong. The trouble is that in the lab frame the
total initial three-momentum is nonzero. (Particle $b$ is at rest, so $a$ has to be moving to
bring in the necessary energy.) Conservation of three-momentum requires that the final
three-momentum be nonzero, and so the final particles cannot all be at rest. Therefore,
the threshold energy is more than just $\sum m_{\text{fin}} c^2$. But what is it?

The easiest way to answer this question is to observe that in the CM frame, where
the total three-momentum is zero, all of the final particles can be at rest. Thus

$E_{\text{cm}} \ge \sum m_{\text{fin}} c^2$ (15.97)

and the equality here is possible, with all final particles at rest. We can now find the
threshold energy in the lab, by comparing the total four-momenta in the two frames. In
the CM frame the total four-momentum has the form $P_{\text{cm}} = (0, 0, 0, E_{\text{cm}}/c)$. In the
lab frame, it is $P_{\text{lab}} = p_a + p_b$, where $p_a = (\mathbf{p}_a, E_a/c)$ and $p_b = (0, 0, 0, m_b c)$ are
the momenta of the two original particles. Now by the invariance of the scalar product,
$P_{\text{cm}}^2 = P_{\text{lab}}^2$, so

$-E_{\text{cm}}^2/c^2 = (p_a + p_b)^2 = p_a^2 + p_b^2 + 2\,p_a \cdot p_b = -m_a^2c^2 - m_b^2c^2 - 2E_a m_b$

or, solving for $E_a$,

$E_a = \dfrac{E_{\text{cm}}^2 - m_a^2c^4 - m_b^2c^4}{2m_b c^2}.$

Inserting the minimum value of $E_{\text{cm}}$ from (15.97), we find for the minimum energy
of the projectile $a$ in the lab frame

$E_a^{\min} = \dfrac{(\sum m_{\text{fin}})^2 - m_a^2 - m_b^2}{2m_b}\,c^2.$ (15.98)


A famous example of the use of this equation was in the design of the experiment
to verify the existence of the antiproton using the reaction (15.96). In this reaction
$\sum m_{\text{fin}} = 4m_p$, while $m_a = m_b = m_p$, so the minimum energy (15.98) is $7m_pc^2$; that
is, the minimum kinetic energy for protons to produce antiprotons by the reaction
(15.96) was $6m_pc^2 \approx 5600$ MeV. The reaction in question was first observed at
Berkeley, using protons accelerated to this energy by a machine called the Bevatron,
which had been specifically designed to accelerate protons to about 6000 MeV, just
enough more than the threshold to be sure to do the job.





An important feature of (15.98) is that the leading term is proportional to $(\sum m_{\text{fin}})^2$.
Thus if the particle one is hoping to produce is very heavy, $E_a^{\min}$ may be prohibitively
large. For example, the particle called the $\psi$ (or $J/\psi$) has a mass of about 3100 MeV/$c^2$
and was discovered in the reaction

$e^+ + e^- \to \psi$

where the positron and electron have mass about 0.5 MeV/$c^2$ each. Putting these
numbers into (15.98), we see that, if this reaction was to be produced by firing
positrons at stationary electrons (or vice versa), the minimum incident energy would
have had to be a fantastic $E_a^{\min} \approx 10^7$ MeV, well out of reach of any electron or
positron accelerator in existence today. The way around this seemingly hopeless
obstacle was to use colliding beams, with the electrons and positrons approaching
one another with approximately equal and opposite momenta. That is, the experiment
was done in the CM frame. From (15.97), you can see that the threshold in this case
is only $E_{\text{cm}} \approx 3100$ MeV. In this experiment, the advantages of this much smaller
threshold energy far outweighed the disadvantages of having to work with two beams
of high-energy particles.
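
The two threshold estimates quoted above follow directly from (15.98), and a few lines of Python make the comparison concrete. This is only a sketch: masses are entered in MeV/$c^2$ so that energies come out in MeV, the function name `threshold_energy` is my own, and the proton mass of 938 MeV/$c^2$ is a rounded value that I have assumed, since the text quotes only $6m_pc^2 \approx 5600$ MeV.

```python
def threshold_energy(final_masses, m_a, m_b):
    """Minimum total energy (MeV) of projectile a on a stationary target b.

    Implements (15.98) with all masses given in MeV/c^2.
    """
    total_final = sum(final_masses)
    return (total_final**2 - m_a**2 - m_b**2) / (2.0 * m_b)

M_PROTON = 938.0    # MeV/c^2, assumed rounded value
M_ELECTRON = 0.5    # MeV/c^2, as in the text
M_PSI = 3100.0      # MeV/c^2, as in the text

# p + p -> p + p + p + pbar: threshold energy and kinetic energy
e_min = threshold_energy([M_PROTON] * 4, M_PROTON, M_PROTON)
print(e_min, e_min - M_PROTON)

# e+ + e- -> psi with a stationary electron target
print(threshold_energy([M_PSI], M_ELECTRON, M_ELECTRON))
```

The first result is $7m_pc^2$, so the kinetic energy is $6m_pc^2$; the second is of order $10^7$ MeV, which is the number that made colliding beams so attractive.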


15.15 Force in Relativity 


We have not yet introduced the concept of force into our relativistic mechanics. 
One of the reasons for this is that force plays a much smaller role in relativity than 
in nonrelativistic mechanics. Another is that the concept of force is much more 
complicated in relativity. The most obvious complication is that (like several other 
parameters, such as mass and velocity) force can be defined in several different ways. 
A second complication arises from the possibility that the rest mass of an object can 
change. As we have seen, an inelastic collision of an electron with an atom can give the 
atom additional internal energy and increase its rest mass. For a macroscopic example 
of the same effect, imagine holding a flame under a metal object; the heat absorbed 
increases the object’s internal energy and its rest mass. Like most introductory texts, I 
shall avoid the complication of such “heat-like forces” by confining attention to forces 
that do not change the rest masses of the objects on which they act.²⁶ Fortunately,
these include many of the important forces in special relativity, including the Lorentz
force

$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$ (15.99)

on a charge $q$ in electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$.


26 For a careful and clear discussion of “heat-like” forces, see the excellent book of Wolfgang 
Rindler, Introduction to Special Relativity, Oxford University Press, second edition, 1991, but be 
warned that Rindler tends to use m to denote the variable mass (what we would call ym). 
Of the several conceivable definitions of force in relativity, the single most useful
is probably the three-force defined as

$\mathbf{F} = \dfrac{d\mathbf{p}}{dt}$ (15.100)

where $\mathbf{p}$ denotes the relativistic three-momentum $\mathbf{p} = \gamma m\mathbf{v}$. This is not, of course,
the same as the nonrelativistic force, since $\mathbf{p}$ is not the nonrelativistic momentum,
but it does have the essential merit that it agrees with the nonrelativistic definition
when $v \ll c$ (and $\gamma \approx 1$). A second property that recommends the definition (15.100)
is that experiment shows that, with this definition, the force on a charge $q$ in an
electromagnetic field is given by the Lorentz equation (15.99). Thirdly, with the
definition (15.100), we can prove the analog of the work-KE theorem, as follows:
Recall the useful relation (15.85), that $E^2 = (\mathbf{p}c)^2 + (mc^2)^2$. Differentiating both
sides with respect to time, we see that (remember we’re assuming that the rest mass
$m$ doesn’t change, so there is no term in $dm/dt$, but see Problem 15.85)


$E\,\dfrac{dE}{dt} = \mathbf{p}c^2 \cdot \dfrac{d\mathbf{p}}{dt} = \mathbf{p}c^2 \cdot \mathbf{F}$

or, dividing both sides by $E$ and recalling that $\mathbf{p}c^2/E = \mathbf{v}$,

$\dfrac{dE}{dt} = \mathbf{v} \cdot \mathbf{F}.$ (15.101)

Multiplying both sides by $dt$ we find that

$dE = \mathbf{F} \cdot d\mathbf{x},$ (15.102)

where $d\mathbf{x}$ denotes the displacement $d\mathbf{x} = \mathbf{v}\,dt$. Finally, since $E = mc^2 + T$ and we
are assuming that the mass $m$ does not change, we find that

$dT = \mathbf{F} \cdot d\mathbf{x}$ (15.103)

which is precisely the work-KE theorem, generalized to include relativistic energies
and forces.


example 15.11 Motion with a Constant Force

An object of fixed rest mass $m$ is acted on by a uniform, constant force $\mathbf{F}$ (for
example, the force on a charge in a uniform electrostatic field) and is released
from rest at the origin at $t = 0$. Find the object’s three-momentum $\mathbf{p}$, its
three-velocity $\mathbf{v}$, and its position $\mathbf{x}$, all as functions of time.

Integrating (15.100) (with $\mathbf{F}$ constant), we find immediately

$\mathbf{p} = \mathbf{F}t.$ (15.104)

From (15.85) it is easy to see that $\gamma^2 = 1 + \mathbf{p}^2/(mc)^2$, so

$\gamma = \sqrt{1 + (Ft/mc)^2}$

and

$\mathbf{v} = \dfrac{\mathbf{p}}{m\gamma} = \dfrac{\mathbf{F}t}{m\sqrt{1 + (Ft/mc)^2}}.$ (15.105)


When t is small, we can neglect the second term inside the square root, and we 
recover the nonrelativistic answer v = Ft/m, but when t gets large, the second 
term in the square root dominates and we find that v approaches c, without ever 
quite reaching it. This is consistent with our knowledge that no material particle 
can have speed greater than or equal to the speed of light. 

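To find the object’s position $\mathbf{x}$, we have only to integrate (15.105) to give

$\mathbf{x}(t) = \dfrac{\mathbf{F}}{F}\,\dfrac{mc^2}{F}\left(\sqrt{1 + (Ft/mc)^2} - 1\right).$ (15.106)

As you can easily check, when $t$ is small, this reduces to the familiar nonrelativistic
result $\mathbf{x} = \tfrac{1}{2}\mathbf{F}t^2/m$ (that is, $\tfrac{1}{2}\mathbf{a}t^2$); when $t \to \infty$, it is asymptotic to ($ct$ +
const), in the direction of $\mathbf{F}$, as the speed approaches the speed of light (Problem
15.82).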
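
The two limits of this example are easy to see numerically. The following Python sketch evaluates the momentum (15.104), the speed (15.105), and the position obtained by integrating (15.105), for motion along the direction of $\mathbf{F}$. The function name `constant_force_motion` and the default choice $c = 1$ are assumptions made only for illustration.

```python
import math

def constant_force_motion(force, mass, t, c=1.0):
    """Return (p, v, x) at time t for an object released from rest at x = 0.

    p = F t                                   (15.104)
    v = (F t / m) / sqrt(1 + (F t / m c)^2)   (15.105)
    x = (m c^2 / F) (sqrt(1 + (F t / m c)^2) - 1)
    """
    u = force * t / (mass * c)
    root = math.sqrt(1.0 + u**2)
    p = force * t
    v = (force * t / mass) / root
    x = (mass * c**2 / force) * (root - 1.0)
    return p, v, x

# Early times: close to the nonrelativistic v = F t / m and x = F t^2 / (2 m)
print(constant_force_motion(1.0, 1.0, 1e-3))

# Late times: v creeps up toward c = 1 without reaching it
print(constant_force_motion(1.0, 1.0, 1e3))
```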


Potential Energy 

It can happen that, at least in one frame S, the force $\mathbf{F}$ on an object is the gradient
of a function $U(\mathbf{x})$; that is, $\mathbf{F} = -\nabla U(\mathbf{x})$ and the force is conservative. This is
the case, for example, for a charge $q$ moving in an electrostatic field. When this
happens, the work done on the object as it moves through a displacement $d\mathbf{x}$ is
$\mathbf{F} \cdot d\mathbf{x} = -\nabla U \cdot d\mathbf{x} = -dU$. Combining this with the work-KE theorem (15.103), we
find that $dT = -dU$ or $d(T + U) = 0$; that is, just as in nonrelativistic mechanics, if
the force on an object is conservative, $T + U$ is conserved.


The Four-Force 

The three-force F = dp/dt is not the spatial part of a four-vector. (The trouble is that, 
although dp is the spatial part of a four-vector, dt is not a scalar.) In this respect, the 
three-force is like the three-velocity $\mathbf{v} = d\mathbf{x}/dt$, and the transformation of $\mathbf{F}$ from one
frame to another is correspondingly complicated. However, by analogy with the four-velocity,
it is easy to see how to define a four-force that is closely related to the three-force.
We can define the four-force on an object as the derivative of $p$ with respect to
the proper time $t_0$ measured along the object’s world line:

$K = \dfrac{dp}{dt_0}.$ (15.107)


(There is no widely accepted notation for the four-force, but $K$ is one of the several
notations used.) Since $dp$ is a four-vector and $dt_0$ is a four-scalar, $K$ is automatically
a four-vector. Since $dt_0 = dt/\gamma$, we can rewrite $K$ as

$K = (\mathbf{K}, K_4) = \gamma\left(\dfrac{d\mathbf{p}}{dt}, \dfrac{1}{c}\dfrac{dE}{dt}\right) = \gamma(\mathbf{F}, \mathbf{v} \cdot \mathbf{F}/c)$ (15.108)

where the last equality follows from (15.101). We see that the spatial part of the
four-force is $\gamma$ times the three-force $\mathbf{F}$, just as the spatial part of the four-velocity $u$ is
$\gamma$ times the usual three-velocity $\mathbf{v}$, as in (15.67).

The advantages of the four-force stem from its being a four-vector. This means 
that its transformation from one frame to another is just the familiar Lorentz transformation.
It also means that the Lorentz invariance of any physical law formulated in
terms of the four-force is easy to check. The main disadvantage of the four-force is 
that it gives the time derivative of momentum with respect to the proper time, where 
the three-force gives the derivative with respect to the time of any one inertial frame. 
Since our main interest is usually in the motion of an object in terms of the time in one 
particular frame (as opposed to the proper time of the moving object), the three-force 
is, in this respect, more useful. 


15.16 Massless Particles; the Photon 


A surprising consequence of relativity is the possibility of particles with zero mass,
$m = 0$. In nonrelativistic mechanics, the notion of a massless particle makes no sense
at all. The definitions $\mathbf{p} = m\mathbf{v}$ and $T = \tfrac{1}{2}mv^2$ show clearly that a particle with $m = 0$
would have no momentum and no kinetic energy and, hence, would presumably be
nothing at all. At first glance, the same argument might seem to apply in relativity: If
we let $m \to 0$ in the relativistic definitions

$\mathbf{p} = \gamma m\mathbf{v}$ and $E = \gamma mc^2$ (15.109)

we would seem to get the same conclusion, namely that a massless particle would have no
momentum or energy and hence would not exist. Let us shelve this difficulty for a
moment and look at the two relations (15.85) and (15.83)

$E^2 = (mc^2)^2 + (\mathbf{p}c)^2$ and $\dfrac{\mathbf{v}}{c} = \dfrac{\mathbf{p}c}{E}.$ (15.110)

If there were to be a particle with $m = 0$ (and if these relations still applied) then the
first relation would become

$E = |\mathbf{p}|c$ [if $m = 0$] (15.111)

and this, combined with the second relation, would imply that the particle’s speed had
to be $c$,

$v = c$ [if $m = 0$]. (15.112)

In other words, if there were to be a massless particle, then the usual relations of
relativistic mechanics would require that it always travel with speed $c$. If we return now
to our original definitions (15.109) of $\mathbf{p}$ and $E$, we see that as $m \to 0$, so $v \to c$ and
$\gamma \to \infty$. Thus for a massless particle, the two definitions would take the form $\infty \times 0$,
which is undefined and hence does not actually rule out the existence of particles with
$m = 0$.

Evidently, relativistic mechanics has room for massless particles, always traveling 
at speed c. Whether such particles exist is, of course, a question for experiment, and 
experiment tells us unambiguously that they do. The photon is the particle that carries 
the energy and momentum of electromagnetic waves; and experiment shows that, for 
a photon, E and p do satisfy (15.111) and that photons do always travel (no surprise!) 
at the speed of light. 27 

Notice that with m = 0, the four-momentum of a photon satisfies 

$p^2 = 0;$ (15.113)

its invariant length squared is zero. We have seen that the four-momentum of a material 
particle (that is, a particle with m > 0) is always forward time-like. By contrast, that 
of any massless particle lies on the forward light cone and is forward light-like. 

Since the definitions (15.109) are no longer meaningful, the question naturally 
arises how the energy and momentum of a massless particle are defined. In principle, 
at least, they can be defined using the conservation laws. Consider, for example, the 
emission of a photon by an atom X: 

$X^* \to X + \gamma$ (15.114)

where $X^*$ denotes an excited state of the atom, and $X$ its ground state, and $\gamma$ is the
standard symbol for a photon. Since we already know how to define and measure the
energy and momentum of either state of the atom, we can find the corresponding values 
for the photon from conservation of energy and momentum. This definition must, of 
course, be checked for consistency. Would a different process yield the same answers 


27 It used to be thought that the neutrino was another example of a massless particle, but current
evidence shows that its mass, though small, is definitely nonzero, perhaps about $10^{-6}$ times the
electron mass. (For comparison, the experimental limit on a possible photon mass is of order $10^{-20}$
times the electron mass.) On theoretical grounds it is generally assumed there must be a massless
particle called the graviton, which does for gravity what the photon does for electromagnetism, but
there is no direct evidence for the graviton.

for the same photon? For example, suppose we allowed the photon of (15.114) to
collide with a second atom $Y$ and eject an electron (the photoelectric effect):

$\gamma + Y \to Y^+ + e$ (15.115)

where $Y^+$ denotes the positive ion of $Y$ with one electron removed. Using this
process, we could measure the energy and momentum of the photon, and these second 
measurements should yield the same answers as the first. Experiment has shown 
repeatedly that the energy and momentum of the photon are consistently defined in 
this way. 

In fact there is a second way to find the energy and momentum of the photon. One 
of the first discoveries (due to Max Planck and Einstein) in the unfolding of quantum 
mechanics was that the energy of a photon is related to the frequency of its associated 
electromagnetic wave by the famous relation 

$E = \hbar\omega$ (15.116)

where $\hbar$ is Planck’s constant (actually the original Planck constant $h$ divided by
$2\pi$, $\hbar = h/2\pi = 1.05 \times 10^{-34}$ J·s) and $\omega$ is the angular frequency of the wave.
Similarly, the momentum of a photon is given by

$\mathbf{p} = \hbar\mathbf{k}$ (15.117)

where k is the wave vector of the wave. Thus, E and p can be found by measuring 
the frequency and wave vector of the corresponding wave. It is a pleasing fact that the 
two relations (15.116) and (15.117) can be combined into a single four-vector relation. 
Since $p = (\mathbf{p}, E/c)$ and the wave four-vector is $k = (\mathbf{k}, \omega/c)$, the two relations imply
that

$p = \hbar k.$ (15.118)

Since both sides of this equation are four-vectors, this relation is relativistically 
invariant; that is, the two quantum relations (15.116) and (15.117) are consistent with 
the principles of relativity. 

The relation (15.117) is often rewritten in terms of the wavelength $\lambda$. Since $|\mathbf{k}| = 2\pi/\lambda$,

$|\mathbf{p}| = \hbar|\mathbf{k}| = \dfrac{2\pi\hbar}{\lambda} = \dfrac{h}{\lambda}.$ (15.119)

In this form the relation is often called the de Broglie relation in honor of the French 
physicist Louis de Broglie (1892-1987), who first proposed that this relation should 
apply to the quantum wave associated with any particle — not just photons. For future 
reference, I’ll rewrite a photon’s four-momentum as follows: 

$p = \hbar k = \hbar(\mathbf{k}, \omega/c) = \dfrac{\hbar\omega}{c}(\hat{\mathbf{k}}, 1)$ (15.120)

where the last equality holds since $|\mathbf{k}| = \omega/c$, so $\mathbf{k} = (\omega/c)\hat{\mathbf{k}}$.

The Compton Effect

Historically, the most influential and persuasive evidence for the existence of massless 
photons obeying the relations of the last few paragraphs was the experiment of the 
American physicist Arthur Compton (1892-1962) in 1923. Compton fired X-ray 
photons at stationary electrons 28 and measured the increase in wavelength of the 
scattered photons. Classical theory would require that the wavelength of the scattered 
radiation should be exactly the same as that of the incident waves, but an increase in 
wavelength is easily explained if the radiation is carried by particle-like photons. When 
a photon collides with an electron, the electron recoils, taking some of the photon’s 
original energy. Thus the emerging photons must have less energy and hence less 
momentum than those in the incident beam. According to (15.119) less momentum 
means longer wavelength, and the increase in wavelength is explained. Using the 
relations of the last few paragraphs, Compton was able to calculate the expected shift 
in wavelength and his experiment triumphantly confirmed his calculations. 

The Compton experiment is shown schematically in Figure 15.15. The photon
comes in from the left with four-momentum $p_{\gamma 0}$ and emerges at angle $\theta$ with
four-momentum $p_\gamma$, where according to (15.120)

$p_{\gamma 0} = \dfrac{\hbar\omega_0}{c}(\hat{\mathbf{k}}_0, 1)$ and $p_\gamma = \dfrac{\hbar\omega}{c}(\hat{\mathbf{k}}, 1).$ (15.121)

The electron’s initial four-momentum is

$p_0 = (0, 0, 0, mc)$ (15.122)

Figure 15.15 A photon, labeled $\gamma$, with four-momentum $p_{\gamma 0}$ collides
with a stationary electron. The photon emerges at an angle $\theta$ with
four-momentum $p_\gamma$, and the electron recoils with four-momentum $p$.


28 The target electrons were actually the valence electrons in the carbon atoms of a piece of 
graphite. Thus the electrons were certainly not perfectly stationary, but their kinetic energy (a few 
eV) was negligible compared to the energy of the X-ray photons (many thousands of eV). By the 
same token, the binding of the electrons (binding energy ~ a few eV) was unimportant with such 
high photon energies. 
and it recoils with four-momentum $p$. Now, by conservation of four-momentum,
$p_0 + p_{\gamma 0} = p + p_\gamma$ or

$p_0 + (p_{\gamma 0} - p_\gamma) = p.$

Squaring both sides we find that

$p_0^2 + 2\,p_0 \cdot (p_{\gamma 0} - p_\gamma) + (p_{\gamma 0}^2 - 2\,p_{\gamma 0} \cdot p_\gamma + p_\gamma^2) = p^2.$

Since $p_0^2 = p^2 = -m^2c^2$, these two terms cancel, and since $p_{\gamma 0}^2 = p_\gamma^2 = 0$, these two
terms drop out, and we are left with

$p_0 \cdot (p_{\gamma 0} - p_\gamma) = p_{\gamma 0} \cdot p_\gamma.$ (15.123)

Substitution of (15.122) and (15.121), followed by a little algebra, yields (as you
should check)

$\omega_0 - \omega = \dfrac{\hbar}{mc^2}(1 - \cos\theta)\,\omega_0\omega$

or

$\dfrac{1}{\omega} - \dfrac{1}{\omega_0} = \dfrac{\hbar}{mc^2}(1 - \cos\theta).$

Finally, replacing $\omega$ by $2\pi c/\lambda$, we find the desired shift in wavelength,

$\Delta\lambda = \lambda - \lambda_0 = \dfrac{h}{mc}(1 - \cos\theta),$ (15.124)

where I have replaced $2\pi\hbar$ by the original Planck constant $h$. This is the celebrated
Compton formula for the shift in the wavelength of scattered radiation. That Compton’s
data agreed with this prediction at several angles gave strong support to the idea
that the energy and momentum of radiation are carried by photons obeying the laws
of relativistic mechanics, but with $m = 0$.
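
To get a feeling for the size of the Compton shift, one can evaluate (15.124) for an electron at a few scattering angles. The sketch below is only illustrative: the function name `compton_shift` is mine, and the rounded SI values of $h$, the electron mass, and $c$ are inserted by me rather than taken from this section (apart from the electron mass, which was quoted in Example 15.8).

```python
import math

PLANCK_H = 6.626e-34        # J s, rounded value (assumed)
ELECTRON_MASS = 9.11e-31    # kg, as quoted in Example 15.8
SPEED_OF_LIGHT = 3.00e8     # m/s, rounded value (assumed)

def compton_shift(theta):
    """Wavelength shift (in meters) for scattering angle theta (in radians), from (15.124)."""
    return PLANCK_H / (ELECTRON_MASS * SPEED_OF_LIGHT) * (1.0 - math.cos(theta))

for degrees in (0, 45, 90, 135, 180):
    shift = compton_shift(math.radians(degrees))
    print(f"theta = {degrees:3d} deg: shift = {shift:.2e} m")
```

The shift vanishes in the forward direction and is largest for backward scattering, where it is $2h/mc$; it does not depend on the incident wavelength.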


15.17 Tensors* 


* As usual, sections marked with an asterisk can be omitted on a first reading. 

In the next and final section of this chapter, I shall give a very brief account of the 
relativistic form of electromagnetic theory. Unfortunately, even this brief introduction 
requires some knowledge of the properties of tensors in four-dimensional space- 
time, and these four-tensors are the main subject of the present section. A complete 
account of four-tensors requires a rather elaborate machinery of “covariant” and 
“contravariant” vectors, but, for our present purposes we can manage without this 
formalism. Nevertheless, you should be aware that if you wish to pursue the study 
of relativity much further, you will need to master this more elaborate machinery.²⁹
Here I shall start my account by examining the transformation properties of three- 
dimensional vectors and tensors. 


Vectors and Tensors in Three Dimensions 

A three-vector $\mathbf{a}$ is characterized by its three components (in each frame) and by its
transformation properties under rotations, as in (15.35),

$$a' = Ra, \tag{15.125}$$

where $a$ and $a'$ denote the columns made up of the three components in each of the two
frames, and $R$ is the ($3 \times 3$) rotation matrix connecting them. In detail, this matrix
equation reads

$$a'_i = \sum_j R_{ij}\,a_j, \tag{15.126}$$

where the sum runs from $j = 1$ to 3.

For future reference we need to establish an important property of the rotation
matrices $R$. We know, of course, that any rotation leaves the scalar product $\mathbf{a}\cdot\mathbf{b}$ of
any two vectors invariant; that is, $\mathbf{a}\cdot\mathbf{b} = \mathbf{a}'\cdot\mathbf{b}'$, with $a'$ given by (15.125), and likewise
$b'$. Now, in matrix notation, the scalar product is

$$\mathbf{a}\cdot\mathbf{b} = \tilde{a}\,b, \tag{15.127}$$

where we must now insist that $b$ denotes the column of numbers $b_1, b_2, b_3$ and $\tilde{a}$ the
row of numbers $a_1, a_2, a_3$ (the tilde denotes the transpose). Thus, in matrix notation, the
equation $\mathbf{a}\cdot\mathbf{b} = \mathbf{a}'\cdot\mathbf{b}'$ reads

$$\tilde{a}\,b = (Ra)^{\sim}(Rb) = \tilde{a}\,(\tilde{R}R)\,b. \tag{15.128}$$

[Here I used the result that $(Ra)^{\sim} = \tilde{a}\tilde{R}$; see Problem 15.94.] Since this must hold
for any choices of $a$ and $b$, it is easy to show (Problem 15.95) that

$$\tilde{R}R = 1, \tag{15.129}$$

where, as before, 1 denotes the $3 \times 3$ unit matrix. Any matrix satisfying this condition
is said to be an orthogonal matrix, so we have proved that rotations are given by $3 \times 3$
orthogonal matrices.
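
The orthogonality condition (15.129) and the invariance of the scalar product are easy to check numerically. The sketch below is only an illustration, not a proof: it uses a rotation about the z axis through an arbitrary angle, and the helper names (`rotation_z`, `transpose`, `matmul`, `matvec`, `dot`) as well as the sign convention chosen for the rotation are assumptions of mine.

```python
import math

def rotation_z(phi):
    """3x3 matrix R for a rotation through angle phi about the z axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product a b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Matrix m acting on the column v."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def dot(a, b):
    """Ordinary three-dimensional scalar product."""
    return sum(x * y for x, y in zip(a, b))

R = rotation_z(0.7)
RtR = matmul(transpose(R), R)                  # left side of (15.129)
a, b = [1.0, 2.0, 3.0], [-4.0, 0.5, 2.0]
a_prime, b_prime = matvec(R, a), matvec(R, b)  # Eq. (15.125) for each vector

print(RtR)                                     # close to the unit matrix
print(dot(a, b), dot(a_prime, b_prime))        # the two scalar products should agree
```

Up to rounding errors, the product of the transposed matrix with `R` is the unit matrix, and the scalar product of the two vectors is the same before and after the rotation.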

A three-dimensional tensor $T$ comprises nine elements $T_{ij}$ (in each three-dimensional
Cartesian reference frame), where $i$ and $j$ both take values from 1 to
3. (Strictly speaking, a tensor with nine elements $T_{ij}$ is a second-rank tensor. More
generally, a tensor of rank $n$ has $3^n$ elements, but we shall only be concerned with


29 For a clear and reasonably simple account see David J. Griffiths, Introduction to
Electrodynamics, third edition (Prentice Hall, 1999), pp. 501 and 535.
the case $n = 2$.) To find the transformation properties of a tensor, let us consider the
simple example of a tensor with elements

$$T_{ij} = a_i b_j, \tag{15.130}$$

where $a_i$ and $b_j$ are the components of any two vectors. (For example, $\mathbf{a}$ could be the
position of a particle and $\mathbf{b}$ its velocity.) This obviously has the requisite nine elements
and is, in a sense, the prototypical tensor.

The transformation of the tensor (15.130) follows immediately from that of the two
vectors from which it is constructed:

$$T'_{ij} = a'_i b'_j = \sum_{k,l} R_{ik}R_{jl}\,a_k b_l = \sum_{k,l} R_{ik}R_{jl}\,T_{kl}. \tag{15.131}$$

This transformation is the defining characteristic of a tensor. It closely parallels the
transformation (15.126) for a vector, except that the tensor, with its two indices, gets
two rotation matrices, one for each index.

We can write the transformation (15.131) in matrix form if we note that $R_{jl} = (\tilde{R})_{lj}$,
where $\tilde{R}$ denotes the transpose of $R$, so that (15.131) can be written

$$T' = RT\tilde{R}. \tag{15.132}$$

Any three-dimensional (second-rank) tensor transforms according to this equation, 
and any set of nine elements which transforms in this way is a tensor. 

One of the most important operations one can perform with tensors is to multiply
them by vectors. For example, the angular momentum $\mathbf{L}$ of a rigid body is given by
the product $\mathbf{L} = \mathbf{I}\boldsymbol{\omega}$ of the moment of inertia tensor $\mathbf{I}$ and the angular velocity vector
$\boldsymbol{\omega}$. It is important that any product of this form is, as we would expect, a vector, and
this is easy to prove: Let $T$ be any tensor and $a$ any vector, and in every reference
frame let the column of three numbers $b$ be defined as $b = Ta$. To show that, with this
definition, $b$ is a vector, we use the properties (15.132) and (15.125) of $T$ and $a$ as
follows:

$$b' = T'a' = (RT\tilde{R})(Ra) = RT(\tilde{R}R)a = RTa = Rb, \tag{15.133}$$

where the fourth equality follows because, according to (15.129), $\tilde{R}R = 1$. We conclude
that $b$, which is clearly a column of three numbers, transforms like a vector; that
is, $b = Ta$ is a vector. We can turn this argument around and prove (Problem 15.99)
that if $a$ and $b$ are known to be vectors and in every frame it is found that $b = Ta$,
where $T$ is a $3 \times 3$ array of numbers (in every frame), then $T$ satisfies (15.132) and is
therefore a tensor.
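
As a concrete check of (15.131) to (15.133), one can build the prototypical tensor (15.130) from two vectors, rotate it both ways, and compare. The Python sketch below is merely illustrative; the rotation about the x axis, the sample numbers, and all helper names are hypothetical choices.

```python
import math

def rotation_x(phi):
    """3x3 matrix R for a rotation through angle phi about the x axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def outer(a, b):
    """Nine elements T_ij = a_i b_j, as in Eq. (15.130)."""
    return [[x * y for y in b] for x in a]

R = rotation_x(0.4)
a, c = [1.0, -2.0, 0.5], [3.0, 1.0, 2.0]

T = outer(a, c)                                # tensor in the original frame
T_prime = matmul(matmul(R, T), transpose(R))   # matrix form (15.132)
T_direct = outer(matvec(R, a), matvec(R, c))   # built from the rotated vectors

v = [0.3, 0.7, -1.1]
b = matvec(T, v)                               # b = T v in the original frame
b_prime = matvec(T_prime, matvec(R, v))        # b' = T' v' in the rotated frame
print(b_prime, matvec(R, b))                   # should agree, as in (15.133)
```

The two ways of computing the rotated tensor agree, and the product of the tensor with a vector does transform like a vector.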
Vectors and Tensors in Four-Dimensional Space-Time

The discussion of vectors and tensors in four-dimensional space-time closely parallels
that just given for three dimensions, with the only complications coming from the
minus sign in the invariant scalar product. A four-vector is given by a column $a$ of
four numbers (in each frame), which transform under the Lorentz transformation
$a' = \Lambda a$ as we move from one inertial frame S to another, S'. This transformation
leaves invariant the scalar product, which we can write in matrix notation as

$$a\cdot b = a_1b_1 + a_2b_2 + a_3b_3 - a_4b_4 = \tilde{a}\,G\,b, \tag{15.134}$$

where $b$ denotes the column for the four-vector $b$, $\tilde{a}$ is the row for $a$, and $G$ is the
$4 \times 4$ matrix

$$G = \begin{bmatrix} +1 & 0 & 0 & 0 \\ 0 & +1 & 0 & 0 \\ 0 & 0 & +1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}. \tag{15.135}$$

This metric matrix, inserted between $\tilde{a}$ and $b$ in (15.134), simply changes the sign of
the fourth component of $b$ and so inserts the needed minus sign in the scalar product.

The scalar product (15.134) is invariant when we replace $a$ and $b$ by $\Lambda a$ and $\Lambda b$,
and exactly the argument that led to (15.129) shows (Problem 15.98) that

$$\tilde{\Lambda}G\Lambda = G, \tag{15.136}$$

the relativistic analog of the condition $\tilde{R}R = 1$ for rotations. The set of all $4 \times 4$
matrices that satisfy the condition (15.136) is called the Lorentz group, since all
Lorentz transformations must satisfy this condition.
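
The condition (15.136) can be verified numerically for any particular Lorentz transformation. The sketch below does this for the standard boost along the first axis, with the fourth component taken to be $ct$; the boost speed of $0.6c$, the sample four-vectors, and the helper names are all assumptions made for illustration.

```python
import math

# Metric matrix G of Eq. (15.135)
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def boost_x(beta):
    """Standard boost with speed beta*c along axis 1, acting on (x1, x2, x3, x4 = ct)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, 0.0, 0.0, -gamma * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def scalar_product(a, b):
    """Invariant scalar product (15.134): the row of a, times G, times the column of b."""
    return sum(x * y for x, y in zip(a, matvec(G, b)))

L = boost_x(0.6)
LGL = matmul(matmul(transpose(L), G), L)   # left side of (15.136)

a, b = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 3.0]
print(LGL)                                 # close to G
print(scalar_product(a, b), scalar_product(matvec(L, a), matvec(L, b)))
```

A single boost is of course only one member of the Lorentz group, but the same check can be applied to any candidate matrix.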

A four-tensor (strictly speaking a four-tensor of rank 2) is defined as a set of sixteen
numbers $T_{\mu\nu}$ (defined for every inertial frame S), where the indices $\mu$ and $\nu$ run from
1 to 4, which, when formed into a $4 \times 4$ matrix $T$, satisfy

$$T' = \Lambda T\tilde{\Lambda}, \tag{15.137}$$

a property that exactly parallels Equation (15.132) for three-tensors. Just as we
form the scalar (or dot) product (15.134) of two four-vectors by inserting the matrix
$G$ between the two appropriate matrices, so we can form a dot product of a tensor and
a vector in the same way:

$$T\cdot a = TGa. \tag{15.138}$$

It is a straightforward exercise to show that, if $T$ is any tensor and $a$ any vector, then
$b = T\cdot a$ is a four-vector. [The proof parallels (15.133) for three dimensions; see
Problem 15.96.] Conversely, you can prove (Problem 15.99) a “quotient rule” that if
$a$ and $b$ are known to be four-vectors and if $b = T\cdot a$ in every frame S, then $T$ (the
“quotient” of $b$ and $a$) is a four-tensor.
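
The claim that the dot product (15.138) of a four-tensor with a four-vector is again a four-vector can also be tested numerically. In the sketch below (an illustration under assumed values, with hypothetical helper names), an arbitrary array of sixteen numbers is taken as the tensor in one frame and is transformed to a second frame by (15.137).

```python
import math

G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]          # metric matrix (15.135)

def boost_x(beta):
    """Standard boost along axis 1, acting on (x1, x2, x3, x4 = ct)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, 0.0, 0.0, -gamma * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def tensor_dot(T, a):
    """Dot product of a tensor and a vector, T G a, as in Eq. (15.138)."""
    return matvec(T, matvec(G, a))

L = boost_x(0.8)
T = [[1.0, 2.0, 0.0, -1.0],
     [0.5, 0.0, 3.0, 2.0],
     [-2.0, 1.0, 1.0, 0.0],
     [4.0, -3.0, 0.5, 1.5]]          # sixteen arbitrary numbers in frame S
a = [1.0, -1.0, 2.0, 3.0]            # an arbitrary four-vector in frame S

T_prime = matmul(matmul(L, T), transpose(L))   # Eq. (15.137)
a_prime = matvec(L, a)                         # four-vector transformation

b = tensor_dot(T, a)
b_prime = tensor_dot(T_prime, a_prime)
print(b_prime, matvec(L, b))                   # should agree if b is a four-vector
```

The agreement rests on the condition (15.136), which collapses the product of the transposed boost, the metric, and the boost back to the metric in the middle of the expression for `b_prime`.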

Armed with these definitions and properties of four-tensors, we are ready for our 
brief venture into relativistic electrodynamics. 
15.18 Electrodynamics and Relativity


That light travels at speed c in all directions is a consequence of the laws of classical 
electromagnetism, and special relativity grew out of the realization that the speed of 
light c is the same in all inertial frames. These two observations suggest that classical 
electromagnetism might already be consistent with the principles of relativity. The 
simplest way to prove this suggestion is to show that the familiar laws of electrodynamics
can be written in terms of four-scalars, four-vectors, and four-tensors, so that
their invariance under Lorentz transformations is self-evident. Here I shall do this for
just one law, the centrally important Lorentz-force law

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}). \tag{15.139}$$


In the process, we shall find the transformation rules for the electric and magnetic 
fields E and B. Before we address the properties of the fields E and B, you need to 
know that it is an experimental fact that the charge q of any particle has the same value 
in all inertial frames; that is, q is a Lorentz scalar. 

I shall take the view that the Lorentz equation (15.139) is an observed fact, valid 
in all inertial frames (which it definitely is). In the form (15.139), it certainly does not 
look relativistically invariant, and our task is to rewrite it so that it does. Our first clue 
for how to do this is to notice that (15.139) defines F as a linear function of v. The 
next, and very natural, step is to rewrite this linear relation in terms of the four-force 
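$K$ and the four-velocity $u$. Recall that

$$K = \gamma\left(\mathbf{F},\ \frac{\mathbf{F}\cdot\mathbf{v}}{c}\right) \quad\text{and}\quad u = (\gamma\mathbf{v},\ \gamma c). \tag{15.140}$$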


Multiplying both sides of (15.139) by $\gamma$, you can see that $K$ is a linear function
of $u$. The simplest such relation would have the form $K = q\mathcal{F}\cdot u$, where $\mathcal{F}$ would
be some as-yet unknown four-tensor (and I have inserted a separate factor of $q$
since $K$ is obviously proportional to $q$). In matrix form, this relation would read
$K = q\mathcal{F}Gu$ [where $K$ and $u$ must now be seen as $4 \times 1$ columns, and $G$ is the metric
matrix (15.135)]. The 16 elements of the matrix $\mathcal{F}G$ can be found by writing out the
components of $K$ one at a time. For example, from (15.140) and (15.139), the first
component of $K$ is

$$K_1 = \gamma q(E_1 + v_2B_3 - v_3B_2) = q[B_3u_2 - B_2u_3 + (E_1/c)u_4]. \tag{15.141}$$

The coefficients of $u_1, \ldots, u_4$ are just the first row of the matrix $\mathcal{F}G$ in the proposed
relation $K = q\mathcal{F}Gu$. Proceeding in this way, we find the whole matrix $\mathcal{F}G$ to be

$$\mathcal{F}G = \begin{bmatrix} 0 & B_3 & -B_2 & E_1/c \\ -B_3 & 0 & B_1 & E_2/c \\ B_2 & -B_1 & 0 & E_3/c \\ E_1/c & E_2/c & E_3/c & 0 \end{bmatrix}. \tag{15.142}$$


[Compare the first row of this matrix with the coefficients in (15.141); for more details,
see Problem 15.104.] Finally, since $G^2 = 1$ (the $4 \times 4$ unit matrix), we can multiply
by $G$ on the right to give the electromagnetic field tensor

$$\mathcal{F} = \begin{bmatrix} 0 & B_3 & -B_2 & -E_1/c \\ -B_3 & 0 & B_1 & -E_2/c \\ B_2 & -B_1 & 0 & -E_3/c \\ E_1/c & E_2/c & E_3/c & 0 \end{bmatrix}, \tag{15.143}$$

in terms of which the Lorentz force takes the beautifully simple form

$$K = q\mathcal{F}\cdot u. \tag{15.144}$$


We are taking the view that the Lorentz-force law is an experimental fact, valid in
all inertial frames. In the equation (15.144), $K$ and $u$ are known to be four-vectors,
and the charge $q$ is a scalar. It follows from the quotient rule quoted below Equation
(15.138) that $\mathcal{F}$ is a four-tensor, which will let us find the behavior of the electric
and magnetic fields under any Lorentz transformation. 30 For future reference, notice
that $\mathcal{F}$ is an antisymmetric tensor; that is, the matrix $\mathcal{F}$ is antisymmetric, satisfying
$\tilde{\mathcal{F}} = -\mathcal{F}$.

Lorentz Transformation of Electric and Magnetic Fields 

The field tensor $\mathcal{F}$ specifies the fields E and B in any given inertial frame S. Since $\mathcal{F}$
is a four-tensor, its value in any other frame S' is given by (15.137) as

$$\mathcal{F}' = \Lambda\mathcal{F}\tilde{\Lambda}. \tag{15.145}$$

For any given Lorentz transformation $\Lambda$, it is a straightforward, though tedious,
exercise to work out the right side of (15.145), and comparing the result to the
definition (15.143) of $\mathcal{F}'$, we can write down the transformed fields. For example,
for the standard boost with velocity $v$ along the $x_1$ axis, one finds (Problem 15.105)

$$\begin{aligned} E'_1 &= E_1, & E'_2 &= \gamma(E_2 - \beta cB_3), & E'_3 &= \gamma(E_3 + \beta cB_2) \\ B'_1 &= B_1, & B'_2 &= \gamma(B_2 + \beta E_3/c), & B'_3 &= \gamma(B_3 - \beta E_2/c). \end{aligned} \tag{15.146}$$


The most striking feature of the transformations (15.146) is that they mix up the 
electric and magnetic fields. A configuration whose fields are purely electric in one 
frame S (B = 0 everywhere, as for any static charge distribution), will inevitably have 
nonzero magnetic components in some other frames S' ($\mathbf{B}' \neq 0$). Thus we can say that


30 There are many different ways to arrive at the field tensor $\mathcal{F}$, some of which define $\mathcal{F}$ to be
a four-tensor. In such an approach, the Lorentz-force equation (15.144) automatically has the form
“four-vector = four-vector,” which guarantees the Lorentz invariance of the Lorentz-force equation.
in relativity the existence of electric fields requires the existence of magnetic fields
and vice versa. 
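
The “straightforward, though tedious” matrix multiplication in (15.145) is a natural job for a computer. The Python sketch below builds the field tensor (15.143) from given fields, applies the standard boost, reads off the transformed fields, and compares them with the explicit formulas (15.146). It is meant only as an illustration: the function names, the sample field values, and the boost speed are all assumptions.

```python
import math

C = 2.99792458e8   # speed of light in m/s

def boost_x(beta):
    """Standard boost along axis 1, acting on (x1, x2, x3, x4 = ct)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, 0.0, 0.0, -gamma * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def field_tensor(E, B):
    """Field tensor of Eq. (15.143) for fields E (V/m) and B (T)."""
    return [[0.0, B[2], -B[1], -E[0] / C],
            [-B[2], 0.0, B[0], -E[1] / C],
            [B[1], -B[0], 0.0, -E[2] / C],
            [E[0] / C, E[1] / C, E[2] / C, 0.0]]

def fields_from_tensor(F):
    """Read E and B back out of a matrix with the layout of (15.143)."""
    E = [C * F[3][0], C * F[3][1], C * F[3][2]]
    B = [F[1][2], F[2][0], F[0][1]]
    return E, B

def transform_by_tensor(E, B, beta):
    """Transformed fields from the tensor rule (15.145)."""
    L = boost_x(beta)
    return fields_from_tensor(matmul(matmul(L, field_tensor(E, B)), transpose(L)))

def transform_by_formula(E, B, beta):
    """Transformed fields from the explicit formulas (15.146)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    Ep = [E[0], g * (E[1] - beta * C * B[2]), g * (E[2] + beta * C * B[1])]
    Bp = [B[0], g * (B[1] + beta * E[2] / C), g * (B[2] - beta * E[1] / C)]
    return Ep, Bp

E, B = [100.0, 200.0, -50.0], [1e-6, -2e-6, 3e-6]
print(transform_by_tensor(E, B, 0.6))
print(transform_by_formula(E, B, 0.6))
```

Calling `transform_by_tensor` with a purely electric field (all components of `B` zero) returns a nonzero magnetic field, which is the mixing of electric and magnetic fields described in the text.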

An important advantage of knowing the transformation properties of the electromagnetic
fields is this: In seeking the fields due to a certain charge and current
distribution in a frame S, it may be possible to find a frame S' in which the fields 
are more easily evaluated. If this happens, then our simplest course may be to write 
down the fields in S' and then transform them back to the original frame S, as in the 
following example. 


Example 15.12 Fields of a Long Straight Current

Find the E and B fields of an infinitely long, uniform line charge with density
$\lambda$ (measured in coulombs/meter), placed on the z axis of frame S and traveling
with speed $v$ in the $+z$ direction.

The moving line charge constitutes a current $I = \lambda v$ along the z axis, so our
problem is to find the combined fields of a line charge and a line current. Let us
recognize first that this can be done by elementary methods, without leaving the
frame S: Using Gauss’s law we can show that the E field of the line charge is
$E = 2k\lambda/\rho$ radially outward from the z axis. (Here $k = 1/4\pi\epsilon_0$ is the Coulomb
force constant and $\rho$ is the perpendicular distance from the z axis, that is, the
first of the coordinates $\rho, \phi, z$ of cylindrical polar coordinates.) Similarly, using
Ampere’s law, we can show the B field of the current is $B = (\mu_0/2\pi)I/\rho$ in the
direction given by the right-hand rule, where $\mu_0$ is the so-called permeability of
space. We can express these two well-known results compactly using the unit
vectors of cylindrical polar coordinates:

$$\mathbf{E} = \frac{2k\lambda}{\rho}\,\hat{\boldsymbol{\rho}} \quad\text{and}\quad \mathbf{B} = \frac{\mu_0}{2\pi}\frac{I}{\rho}\,\hat{\boldsymbol{\phi}}. \tag{15.147}$$

Both of these fields are sketched in Figure 15.16(a).

While the derivation using Gauss’s and Ampere’s laws is perfectly straightforward,
it is instructive to rederive the same results by transforming to a frame
S' traveling with the charges. In S', there is no current, so the only field is the
radial electric field, $E' = 2k\lambda'/\rho'$, as shown in Figure 15.16(b). This field is in
the direction of the unit vector $\hat{\boldsymbol{\rho}}' = (x'/\rho', y'/\rho', 0)$, so can be written as

$$\mathbf{E}' = \frac{2k\lambda'}{\rho'}\,\hat{\boldsymbol{\rho}}' = \frac{2k\lambda'}{\rho'^2}\,(x', y', 0). \tag{15.148}$$

Before we transform this back to the original frame S, we must recognize that the
charge densities $\lambda$ and $\lambda'$ are not equal: The total charge contained in any given
segment of the z axis must be the same in either frame (invariance of charge), so
that $\lambda\,\Delta z = \lambda'\,\Delta z'$, but, because of length contraction, $\Delta z = \Delta z'/\gamma$. Therefore

$$\lambda = \gamma\lambda'. \tag{15.149}$$
Figure 15.16 The fields produced by a line charge on the z axis. (a) In frame S,
the line charge is moving up, out of the page. This constitutes a current, which
produces a B field looping around the z axis, in addition to the E field, which
is radially out from the z axis. (b) The frame S' is the rest frame of the charges,
so there is no current and hence no B field, just the radial E field.


We must now transform the fields $\mathbf{E}'$, given by (15.148), and $\mathbf{B}' = 0$ back to
frame S. To do this we note first that S' is traveling along the z axis of S (not the
x axis, as in the standard boost). Thus we must first rewrite (15.146) for a boost
along the z axis. We must then find the inverse of this transformation (since we
want the unprimed fields in terms of the primed). The result is, as you can easily
check,

$$\begin{aligned} E_1 &= \gamma(E'_1 + \beta cB'_2), & E_2 &= \gamma(E'_2 - \beta cB'_1), & E_3 &= E'_3 \\ B_1 &= \gamma(B'_1 - \beta E'_2/c), & B_2 &= \gamma(B'_2 + \beta E'_1/c), & B_3 &= B'_3. \end{aligned} \tag{15.150}$$

Substituting $\mathbf{B}' = 0$ and the components of $\mathbf{E}'$ from (15.148), we find that

$$\mathbf{E} = \gamma\,\frac{2k\lambda'}{\rho^2}\,(x, y, 0) = \frac{2k\lambda}{\rho}\,\hat{\boldsymbol{\rho}}. \tag{15.151}$$

In writing the first equality I used the fact that $x$ and $y$, and hence $\rho = \sqrt{x^2 + y^2}$,
are invariant under a boost in the z direction; in the second, I replaced $\gamma\lambda'$ by $\lambda$.
This agrees exactly with the E field found in (15.147).

Similarly, substituting $\mathbf{B}' = 0$ and (15.148) into the expressions for B in
(15.150), we find the magnetic field

$$\mathbf{B} = \gamma\beta\,\frac{2k\lambda'}{c\rho^2}\,(-y, x, 0).$$

If we make the replacements $\gamma\lambda' = \lambda$, $\beta = v/c$, $k/c^2 = 1/(4\pi\epsilon_0c^2) = \mu_0/4\pi$,
and $(-y/\rho, x/\rho, 0) = \hat{\boldsymbol{\phi}}$, this becomes

$$\mathbf{B} = \frac{\mu_0}{2\pi}\frac{\lambda v}{\rho}\,\hat{\boldsymbol{\phi}}, \tag{15.152}$$

and, since $\lambda v = I$, the current, this is exactly the same as the B field in (15.147).

The remarkable feature of this derivation of B is that it made no reference
to Ampere’s law. Gauss’s law in the frame S', combined with the Lorentz
transformation of the fields, has given us the result that we normally see as
an expression of Ampere’s law.
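
The two routes of Example 15.12 can be compared numerically. In the sketch below, `fields_direct` evaluates the Gauss and Ampere results (15.147) in frame S, while `fields_by_boost` starts from the pure electric field (15.148) in the rest frame of the charges and applies the inverse boost (15.150). The function names, the sample line density, speed, and field point are hypothetical, and $\mu_0$ is obtained from the assumed relation $\mu_0\epsilon_0 = 1/c^2$.

```python
import math

C = 2.99792458e8                    # speed of light, m/s
EPS0 = 8.8541878128e-12             # permittivity of free space, F/m
K = 1.0 / (4.0 * math.pi * EPS0)    # Coulomb force constant k
MU0 = 1.0 / (EPS0 * C**2)           # permeability, from mu0 * eps0 = 1/c^2

def fields_direct(lam, v, x, y):
    """E and B in frame S from Gauss's and Ampere's laws, Eq. (15.147)."""
    rho2 = x * x + y * y
    E = [2.0 * K * lam * x / rho2, 2.0 * K * lam * y / rho2, 0.0]
    pref = MU0 * lam * v / (2.0 * math.pi * rho2)   # current I = lam * v
    B = [-pref * y, pref * x, 0.0]
    return E, B

def fields_by_boost(lam, v, x, y):
    """E and B in frame S obtained by transforming from the rest frame S'."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    lam_rest = lam / gamma                          # Eq. (15.149)
    rho2 = x * x + y * y                            # x and y are unchanged by a z boost
    Ep = [2.0 * K * lam_rest * x / rho2, 2.0 * K * lam_rest * y / rho2, 0.0]  # (15.148)
    Bp = [0.0, 0.0, 0.0]                            # no current in S'
    # Inverse boost along the z axis, Eq. (15.150)
    E = [gamma * (Ep[0] + beta * C * Bp[1]), gamma * (Ep[1] - beta * C * Bp[0]), Ep[2]]
    B = [gamma * (Bp[0] - beta * Ep[1] / C), gamma * (Bp[1] + beta * Ep[0] / C), Bp[2]]
    return E, B

print(fields_direct(1e-9, 0.8 * C, 0.3, 0.4))
print(fields_by_boost(1e-9, 0.8 * C, 0.3, 0.4))
```

The two printed results agree to within rounding, for a field point 0.5 m from the axis and a line charge of 1 nC/m moving at $0.8c$.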

With this striking example of the behavior of the electromagnetic field under the 
Lorentz transformation, I must end our brief foray into relativistic electrodynamics. 
You can explore a few more aspects in the problems at the end of this chapter, and 
after that you could read the excellent books of Griffiths and Jackson. 31 


Principal Definitions and Equations of Chapter 15

Time Dilation 

If two events, as observed in frame $S_0$, occur at the same place and are separated by
a time $\Delta t_0$, then the time between them as measured in any other frame S is

$$\Delta t = \gamma\,\Delta t_0 \qquad \text{[Eq. (15.11)]}$$

where $\gamma = 1/\sqrt{1 - \beta^2}$, $\beta = V/c$, and $V$ is the speed of S relative to $S_0$.


Length Contraction 

If, as observed in frame $S_0$, a body is at rest and has length $l_0$, then its length measured
in a frame S traveling with velocity $V$ in the direction of the length is

$$l = l_0/\gamma. \qquad \text{[Eq. (15.15)]}$$

Lengths perpendicular to $V$ are unchanged.


The Lorentz Transformation 


The coordinates of any one event as measured in two frames (in standard configuration)
are related by the Lorentz transformation:

$$\begin{aligned} x' &= \gamma(x - Vt) \\ y' &= y \\ z' &= z \\ t' &= \gamma(t - Vx/c^2). \end{aligned} \qquad \text{[Eq. (15.20)]}$$


31 Chapter 12 of David J. Griffiths, Introduction to Electrodynamics, (third edition, Prentice 
Hall, 1999) is at approximately the level of this book but naturally emphasizes electrodynamics 
much more heavily. J. D. Jackson’s Classical Electrodynamics (third edition, John Wiley, 1998) is 
a graduate text, which you could tackle after reading Griffiths’ book. 
The inverse Lorentz transformation is obtained by exchanging primed and unprimed
variables and changing the sign of V. 

The Velocity-Addition Formula 

The velocities of a single object as measured in two frames (in standard configuration)
are related by the velocity-addition formula

$$v'_x = \frac{v_x - V}{1 - v_xV/c^2}, \quad v'_y = \frac{v_y}{\gamma(1 - v_xV/c^2)}, \quad\text{and}\quad v'_z = \frac{v_z}{\gamma(1 - v_xV/c^2)}.$$

[Eqs. (15.26) & (15.27)]
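
For numerical work, the velocity-addition formula is conveniently wrapped in a small function. The sketch below is one possible way to do this; the name `transform_velocity` and the choice of units with $c = 1$ by default are assumptions for illustration.

```python
import math

def transform_velocity(v, V, c=1.0):
    """Velocity (vx, vy, vz) measured in S, converted to frame S' moving at V along x."""
    vx, vy, vz = v
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    d = 1.0 - vx * V / c**2          # common denominator in (15.26) and (15.27)
    return ((vx - V) / d, vy / (gamma * d), vz / (gamma * d))

def speed(v):
    """Magnitude of a three-velocity."""
    return math.sqrt(sum(x * x for x in v))

# A light signal along x still travels at c in S'
print(transform_velocity((1.0, 0.0, 0.0), 0.5))
# A light signal along y is tilted in S', but its speed is still c
print(speed(transform_velocity((0.0, 1.0, 0.0), 0.6)))
# An object moving at 0.7c along -x, seen from a frame moving at 0.7c along +x
print(transform_velocity((-0.7, 0.0, 0.0), 0.7))
```

In the last case the classical answer would be a speed of $1.4c$, whereas the relativistic formula gives a speed below $c$.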


Four-Vectors 

If we rewrite the coordinates $(x, y, z)$ as $(x_1, x_2, x_3)$ and introduce $x_4 = ct$, then the
four-vectors $x = (x_1, x_2, x_3, x_4)$ label points in a four-dimensional space-time. If we
agree to arrange the components of $x$ in a $4 \times 1$ column, then Lorentz transformations
become “rotations” of the form $x' = \Lambda x$, where $\Lambda$ is a $4 \times 4$ matrix. A four-vector
is any set of four numbers, $q = (q_1, q_2, q_3, q_4)$ (one set for each inertial frame) which
transform this way,

$$q' = \Lambda q. \qquad \text{[Section 15.8]}$$


The Invariant Scalar Product 

The scalar product of two four-vectors $x$ and $y$ is defined as

$$x\cdot y = x_1y_1 + x_2y_2 + x_3y_3 - x_4y_4 \qquad \text{[Eq. (15.50)]}$$

and is invariant under all Lorentz transformations. The scalar product of a vector with
itself is often written as $x\cdot x = x^2$.


The Light Cone 

The light cone of a point $Q$ in space-time consists of all light rays through $Q$;
equivalently, it contains all points $P$ with $(x_P - x_Q)^2 = 0$. [Section 15.10]


The Relativistic Doppler Effect 

Light from a source traveling with velocity $V$ relative to frame S is observed at an
angle $\theta$ ($\theta$ = angle between $V$ and the ray of light). If the frequency of the light, as
measured in the source’s rest frame, is $\omega_0$, the frequency observed in S is

$$\omega = \frac{\omega_0}{\gamma(1 - \beta\cos\theta)}. \qquad \text{[Eq. (15.64)]}$$
Mass, Four-Velocity, Momentum, and Energy

The (invariant) mass of an object is defined to be its rest mass. The four-velocity is

$$u = \frac{dx}{dt_0} = \gamma(\mathbf{v}, c). \qquad \text{[Eqs. (15.66) \& (15.67)]}$$

The four-momentum is

$$p = mu = (\gamma m\mathbf{v}, \gamma mc) = (\mathbf{p}, E/c). \qquad \text{[Eqs. (15.68), (15.70), \& (15.75)]}$$

Three Useful Relations

$$\boldsymbol{\beta} = \mathbf{p}c/E, \quad p\cdot p = -(mc)^2, \quad\text{and}\quad E^2 = (mc^2)^2 + (\mathbf{p}c)^2. \qquad \text{[Eqs. (15.83)-(15.85)]}$$

Three-Force and Four-Force 

The three-force $\mathbf{F}$ and four-force $K$ on a particle are

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \quad\text{and}\quad K = \frac{dp}{dt_0}. \qquad \text{[Eqs. (15.100) \& (15.107)]}$$

Massless Particles 

With $m = 0$, a massless particle has

$$E = |\mathbf{p}|c, \quad v = c, \quad\text{and}\quad p^2 = 0. \qquad \text{[Eqs. (15.111)-(15.113)]}$$


Transformation of the Electromagnetic Fields 

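Under the standard boost, the electric and magnetic fields transform as follows:

$$\begin{aligned} E'_1 &= E_1, & E'_2 &= \gamma(E_2 - \beta cB_3), & E'_3 &= \gamma(E_3 + \beta cB_2) \\ B'_1 &= B_1, & B'_2 &= \gamma(B_2 + \beta E_3/c), & B'_3 &= \gamma(B_3 - \beta E_2/c). \end{aligned} \qquad \text{[Eq. (15.146)]}$$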


Problems for Chapter 15

Stars indicate the approximate level of difficulty, from easiest (*) to most difficult (★★★). 

section 15.2 Galilean Relativity 

15.1 ★ Using arguments similar to those of Section 15.2, prove that Newton’s first and third laws are 
invariant under the Galilean transformation. 
15.2 ** Consider a classical inelastic collision of the form A + B → C + D. (For example, this could
be a collision such as Na + Cl → Na⁺ + Cl⁻ in which two neutral atoms exchange an electron and
become oppositely charged ions.) Show that the law of conservation of classical momentum is invariant
under the Galilean transformation if and only if total mass is conserved, as is certainly true in
classical mechanics. (We shall find in relativity that the classical definition of momentum has to be
modified and that total mass is not conserved.)

section 15.4 The Relativity of Time; Time Dilation 

15.3 * A low-flying earth satellite travels at about 8000 m/s. What is the factor $\gamma$ for this speed? As
observed from the ground, by how much would a clock traveling at this speed differ from a ground-based
clock after one hour (as measured by the latter)? What is the percent difference?

15.4 * What is the factor $\gamma$ for a speed of 0.99c? As observed from the ground, by how much would
a clock traveling at this speed differ from a ground-based clock after one hour (one hour as measured
by the latter, that is)?

15.5 * A space explorer A sets off at a steady 0.95c to a distant star. After exploring the star for a short 
time, he returns at the same speed and gets home after a total absence of 80 years (as measured by 
earth-bound observers). How long do A’s clocks say that he was gone, and by how much has he aged 
as compared to his twin B who stayed behind on earth? 

[Note: This is the famous “twin paradox.” It is fairly easy to get the right answer by judicious
insertion of a factor of $\gamma$ in the right place, but to understand it, you need to recognize that it involves
three inertial frames: the earth-bound frame S, the frame S' of the outbound rocket, and the frame S''
of the returning rocket. Write down the time dilation formula for the two halves of the journey and
then add. Notice that the experiment is not symmetrical between the two twins: B stays at rest in the
single inertial frame S, but A occupies at least two different frames. This is what allows the result to
be unsymmetrical.]

15.6 ★ When he returns his Hertz rent-a-rocket after one week’s cruising in the galaxy, Spock is shocked 
to be billed for three weeks’ rental. Assuming that he traveled straight out and then straight back, always 
at the same speed, how fast was he traveling? (See note to Problem 15.5.) 

15.7 ** The muons created by cosmic rays in the upper atmosphere rain down more-or-less uniformly
on the earth’s surface, although some of them decay on the way down, with a half-life of about 1.5 μs
(measured in their rest frame). A muon detector is carried in a balloon to an altitude of 2000 m, and
in the course of an hour detects 650 muons traveling at 0.99c toward the earth. If an identical detector
remains at sea level, how many muons should it register in one hour? Calculate the answer taking
account of the relativistic time dilation and also classically. (Remember that after $n$ half-lives, $2^{-n}$ of
the original particles survive.) Needless to say, the relativistic answer agrees with experiment.

15.8 ** The pion ($\pi^+$ or $\pi^-$) is an unstable particle that decays with a proper half-life of $1.8 \times 10^{-8}$ s.
(This is the half-life measured in the pion’s rest frame.) (a) What is the pion’s half-life measured in a
frame S where it is traveling at 0.8c? (b) If 32,000 pions are created at the same place, all traveling
at this same speed, how many will remain after they have traveled down an evacuated pipe of length
$d = 36$ m? Remember that after $n$ half-lives, $2^{-n}$ of the original particles survive. (c) What would the
answer have been if you had ignored time dilation? (Naturally it is the answer (b) that agrees with
experiment.)
15.9 ** One way to set up the system of synchronized clocks in a frame S, as described at the beginning
of Section 15.4, would be for the chief observer to summon all her helpers to the origin $O$ and
synchronize their clocks there, and then have them travel to their assigned positions very slowly. Prove
this claim as follows: Suppose a certain observer is assigned to a position $P$ at a distance $d$ from the
origin. If he travels at constant speed $V$, when he reaches $P$ how much will his clock differ from the
chief’s clock at $O$? Show that this difference approaches 0 as $V \to 0$.

15.10 *** Time dilation implies that when a clock moves relative to a frame S, careful measurements
made by observers in S will find that the clock is running slow. This is not at all the same thing as
saying that a single observer in S will see the clock running slow, and this latter statement is not always
true. To understand this, remember that what we see is determined by the light as it arrives at our eyes.
Consider an observer standing close beside the x axis as a clock approaches her with speed $V$ along
the axis. As the clock moves from position A to B, it will register a time $\Delta t_0$, but as measured by the
observer’s helpers, the time between the two events (“clock at A” and “clock at B”) is $\Delta t = \gamma\,\Delta t_0$.
However, since B is closer to the observer than A is, the light from the clock at B will reach the observer
in a shorter time than will the light from A. Therefore, the time $\Delta t_{\text{see}}$ between the observer’s seeing
the clock at A and seeing it at B is less than $\Delta t$. (a) Prove that

$$\Delta t_{\text{see}} = \Delta t\,(1 - \beta) = \Delta t_0\sqrt{\frac{1 - \beta}{1 + \beta}}$$

(which is less than $\Delta t_0$). Prove both equalities. (b) What time will the observer see once the clock has
passed her and is moving away?

The moral of this problem is that you must be careful how you state or think about time dilation. 
It’s fine to say “Moving clocks are observed, or measured, to run slow,” but it is definitely wrong to 
say “Moving clocks are seen to run slow.” 

section 15.5 Length Contraction 

15.11 * As a meter stick rushes past me (with velocity $v$ parallel to the stick), I measure its length to
be 80 cm. What is $v$?

15.12 ** Consider the experiment of Problem 15.8 from the point of view of the pions’ rest frame.
What is the half-life of the pions in this frame? In part (b), how long is the pipe as “seen” by the pions 
and how long does it take to pass the pions? How many pions remain at the end of this time? Compare 
with the answer to Problem 15.8 and describe how the two different arguments led to the same result. 

15.13 ** A meter stick is at rest in frame $S_0$, which is traveling with speed $V = 0.8c$ in the standard
configuration relative to frame S. (a) The stick lies in the $x_0y_0$ plane and makes an angle $\theta_0 = 60°$ with
the $x_0$ axis (as measured in $S_0$). What is its length $l$ as measured in S, and what is its angle $\theta$ with the
x axis? [Hint: It may help to think of the stick as the hypotenuse of a 30-60-90 triangle of plywood.]
(b) What is $l$ if $\theta = 60°$? What is $\theta_0$ in this case?

15.14 *** Like time dilation, length contraction cannot be seen directly by a single observer. To explain
this claim, imagine a rod of proper length $l_0$ moving along the x axis of frame S and an observer standing
away from the x axis and to the right of the whole rod. Careful measurements of the rod’s length at
any one instant in frame S would, of course, give the result $l = l_0/\gamma$. (a) Explain clearly why the light
which reaches the observer’s eye at any one time must have left the two ends A and B of the rod at
different times. (b) Show that the observer would see (and a camera would record) a length more than $l$.
[It helps to imagine that the x axis is marked with a graduated scale.] (c) Show that if the observer
is standing close beside the track, he will see a length that is actually more than $l_0$; that is, the length
contraction is distorted into an expansion.

section 15.6 The Lorentz Transformation 

15.15 * Solve the Lorentz transformation equations (15.20) to give $x, y, z, t$ in terms of $x', y', z', t'$.
Verify that you get the inverse Lorentz transformation (15.21). Observe that you could have found the
same result by interchanging primed and unprimed variables and changing $V$ to $-V$.

15.16 * Consider two events that occur at positions $\mathbf{r}_1$ and $\mathbf{r}_2$ and times $t_1$ and $t_2$. Let $\Delta\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$
and $\Delta t = t_2 - t_1$. Write down the Lorentz transformation for $\mathbf{r}_1$ and $t_1$, and likewise for $\mathbf{r}_2$ and $t_2$, and
deduce the transformation for $\Delta\mathbf{r}$ and $\Delta t$. Notice that differences $\Delta\mathbf{r}$ and $\Delta t$ transform in exactly the
same way as $\mathbf{r}$ and $t$. This important property follows from the linearity of the Lorentz transformation.

15.17 * Consider two events that occur simultaneously at t = 0 in frame S, both on the x axis at x = 0 
and x = a. (a) Find the times of the two events as measured in a frame S' traveling in the positive 
direction along the x axis with speed V. (b) Do the same for a second frame S" traveling at speed V 
but in the negative direction along the x axis. Comment on the time ordering of the two events as seen 
in the three different frames. This startling result is discussed further in Section 15.10. 

15.18 ** Use the inverse Lorentz transformation (15.21) to rederive the time-dilation formula (15.8).
[Hint: Consider again the thought experiment of Figure 15.3, with the flash and the beep that occur at 
the same positions as seen in frame S'.] 

15.19 ** A traveler in a rocket of proper length $2d$ sets up a coordinate system S' with its origin $O'$
anchored at the exact middle of the rocket and the $x'$ axis along the rocket’s length. At $t' = 0$ she ignites
a flashbulb at $O'$. (a) Write down the coordinates $x'_F, t'_F$ and $x'_B, t'_B$ for the arrival of the light at the front
and back of the rocket. (b) Now consider the same experiment as observed from a frame S relative
to which the rocket is traveling with speed $V$ (with S and S' in the standard configuration). Use the
inverse Lorentz transformation to find the coordinates $x_F, t_F$ and $x_B, t_B$ for the arrival of the two signals.
Explain clearly in words why the two arrivals are simultaneous in S' but not in S. This phenomenon is
called the relativity of simultaneity.

section 15.7 The Relativistic Velocity-Addition Formula 

15.20 * Newton’s first law can be stated: If an object is isolated (subject to no forces), then it moves 
with constant velocity. We know that this is invariant under the Galilean transformation. Prove that it 
is also invariant under the Lorentz transformation. [Assume that it is true in an inertial frame S, and 
use the relativistic velocity-addition formula to show that it is also true in any other S'.] 

15.21 * A rocket traveling at speed jc relative to frame S shoots forward bullets traveling at speed Ir¬ 
relative to the rocket. What is the speed of the bullets relative to S? 

15.22 * A rocket is traveling at speed 0.9c along the x axis of frame S. It shoots a bullet whose velocity 
v' (measured in the rocket’s rest frame S') is 0.9c along the y' axis of S'. What is the bullet’s velocity 
(magnitude and direction) as measured in S? 

15.23 * As seen in frame S, two rockets are approaching one another along the x axis traveling with 
equal and opposite velocities of 0.9c. What is the velocity of the rocket on the right as measured by 
observers in the one on the left? [This and the previous two problems illustrate the general result that 
in relativity the “sum” of two velocities that are less than c is always less than c. See Problem 15.43.] 
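
The general result quoted in Problem 15.23 can be spot-checked numerically. The sketch below assumes the standard collinear form of the velocity-addition formula, with all speeds in units of c; the function name `add_velocities` and the grid of test speeds are illustrative choices, not taken from the text. A numerical scan is of course not a proof (that is the job of Problem 15.43).

```python
def add_velocities(u, v):
    """Relativistic 'sum' of two collinear velocities, both in units of c."""
    return (u + v) / (1.0 + u * v)

# Scan a grid of speeds below c and record the largest combined speed.
speeds = [i / 100.0 for i in range(100)]  # 0.00c, 0.01c, ..., 0.99c
max_sum = max(add_velocities(u, v) for u in speeds for v in speeds)

# The combined speed never reaches c on this grid.
print(max_sum < 1.0)
```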

15.24 * A robber’s getaway vehicle, which can travel at an impressive 0.8c, is pursued by a cop, whose
vehicle can travel at a mere 0.4c. Realizing that he cannot catch up with the robber, the cop tries to 
shoot him with bullets that travel at 0.5c (relative to the cop). Can the cop’s bullets hit the robber? 

15.25 * A rocket is traveling at speed V along the x axis of frame S. It emits a signal (for example, a
pulse of light) that travels with speed c along the y' axis of the rocket’s rest frame S'. What is the speed
of the signal as measured in S?

15.26 * Two objects A and B are approaching one another, traveling in opposite directions along the
x axis of frame S with speeds $v_A$ and $v_B$. At time t = 0, they are at positions x = 0 and x = d. Write
down their positions for an arbitrary time t and show that they meet at time $t = d/(v_A + v_B)$. Notice
that this implies that the relative velocity of the two objects is $v_A + v_B$ as measured in the frame S in
which they are both moving. This may seem surprising at first thought, since we can clearly choose
values of $v_A$ and $v_B$ for which this relative velocity is larger than c.³²

15.27 *** Frame S' travels at speed $V_1$ along the x axis of frame S (in the standard configuration).
Frame S" travels at speed $V_2$ along the x' axis of frame S' (also in the standard configuration). By
applying the standard Lorentz transformation twice, find the coordinates x", y", z", t" of any event in
terms of x, y, z, t. Show that this transformation is in fact the standard Lorentz transformation with
velocity V given by the relativistic “sum” of $V_1$ and $V_2$.

15.28*** The relativistic velocity-addition formula is the answer to the following question: If u is 
the velocity of an inertial observer B relative to an observer A, and v is the velocity of C relative to 
B, what is the velocity w of C relative to A? Let us denote the answer by w = “u + v.” In classical 
physics, this is just the ordinary vector sum of u and v; in relativity, it is given by the inverse of the 
velocity addition formulas (15.26) and (15.27) (at least for the case that u points along the x axis). 
Taking u = (u, 0, 0) and v = (0, v, 0), write down the components of “u + v” and also of “v + u.”
[Be careful to distinguish between the γ factors $\gamma_u$ and $\gamma_v$ pertaining to u and v.] Show that “u + v” ≠
“v + u,” but that the two vectors have equal magnitudes and differ only by a rotation about the z
axis. This rotation is sometimes called the Wigner rotation and is the cause of the so-called Thomas 
precession, which has an important effect on the fine structure of atomic energy levels. 

section 15.8 Four-Dimensional Space-Time; Four-Vectors 

15.29 * (a) Find the 3 × 3 matrix R(θ) that rotates three-dimensional space about the $x_3$ axis, so that
$\mathbf{e}_1$ rotates through angle θ toward $\mathbf{e}_2$. (b) Show that $[R(\theta)]^2 = R(2\theta)$, and interpret this result.

15.30 * The “angle” φ introduced in connection with Equation (15.40) has several useful properties.
For any speed v < c (with corresponding factors β and γ) we can define φ so that γ = cosh φ. Defined
in this way, φ is called the rapidity corresponding to v. Prove that sinh φ = βγ and that tanh φ = β.

15.31 * Here is a handy property of the rapidity introduced in Problem 15.30: Suppose that observer
B has rapidity $\phi_1$ as measured by A and that C has rapidity $\phi_2$ as measured by B (with both velocities
along the x axis). That is, the speed of B relative to A has $\beta_1 = \tanh\phi_1$, and so on. Prove that the rapidity
of C as measured by A is just $\phi = \phi_1 + \phi_2$.
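
The rapidity relations of Problems 15.30 and 15.31 are easy to test numerically before attempting the proofs. The sketch below assumes the collinear velocity-addition formula in units of c; the helper names and the sample speeds 0.6c and 0.7c are illustrative only.

```python
import math

def rapidity(beta):
    """Rapidity phi defined by tanh(phi) = beta, valid for |beta| < 1."""
    return math.atanh(beta)

def add_velocities(beta1, beta2):
    """Collinear relativistic velocity addition, speeds in units of c."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)

beta1, beta2 = 0.6, 0.7  # illustrative speeds, not from the text

# Adding rapidities should reproduce the relativistic "sum" of the speeds.
phi_total = rapidity(beta1) + rapidity(beta2)
beta_total = add_velocities(beta1, beta2)
print(abs(math.tanh(phi_total) - beta_total) < 1e-12)

# cosh(phi) and sinh(phi) should match gamma and beta*gamma.
gamma1 = 1.0 / math.sqrt(1.0 - beta1 ** 2)
print(abs(math.cosh(rapidity(beta1)) - gamma1) < 1e-12)
print(abs(math.sinh(rapidity(beta1)) - beta1 * gamma1) < 1e-12)
```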


32 Nevertheless, this does not violate any principle of relativity. We shall see that no single object can have 
speed greater than c relative to any inertial frame, but there is nothing to prohibit two objects from having relative 
speed greater than c as measured in a frame where both objects are moving. 

15.32 * In Section 15.8, I claimed that the 4 × 4 matrix $\Lambda_R$ corresponding to a pure rotation has the
block form (15.44). Verify this claim by writing out the separate components of the equation $x' = \Lambda_R x$
and showing that the spatial part $(x_1, x_2, x_3)$ is rotated, while $x_4$ is unchanged.

15.33 ** (a) By exchanging $x_1$ and $x_2$, write down the Lorentz transformation for a boost of velocity
V along the $x_2$ axis and the corresponding 4 × 4 matrix $\Lambda_{B2}$. (b) Write down the 4 × 4 matrices $\Lambda_{R+}$
and $\Lambda_{R-}$ that represent rotations of the $x_1 x_2$ plane through ±π/2, with the angle of rotation measured
counterclockwise. (c) Verify that $\Lambda_{B2} = \Lambda_{R-}\Lambda_{B1}\Lambda_{R+}$, where $\Lambda_{B1}$ is the standard boost along the $x_1$
axis, and interpret this result.

15.34 ** Let $\Lambda_B(\theta)$ denote the 4 × 4 matrix that gives a pure boost in the direction that makes an angle
θ with the $x_1$ axis in the $x_1 x_2$ plane. Explain why this can be found as $\Lambda_B(\theta) = \Lambda_R(-\theta)\Lambda_B(0)\Lambda_R(\theta)$,
where $\Lambda_R(\theta)$ denotes the matrix that rotates the $x_1 x_2$ plane through angle θ and $\Lambda_B(0)$ is the standard
boost along the $x_1$ axis. Use this result to find $\Lambda_B(\theta)$ and check your result by finding the motion of
the spatial origin of the frame S as observed in S'.

15.35 ** Prove the following useful result, called the zero-component theorem: Let q be a four-vector,
and suppose that one component of q is found to be zero in all inertial frames. (For example, $q_4 = 0$
in all frames.) Then all four components of q are zero in all frames. 

section 15.9 The Invariant Scalar Product 

15.36 * We have seen that the scalar product x • x of any four-vector x with itself is invariant under 
Lorentz transformations. Use the invariance of x • x to prove that the scalar product x • y of any two 
four-vectors x and y is likewise invariant. 

15.37 * Verify directly that x' • y' = x • y for any two four-vectors x and y, where x' and y' are related
to x and y by the standard Lorentz boost along the $x_1$ axis.

15.38 ** As an observer moves through space with position x(t), the four-vector (x(t), ct) traces a
path through space-time called the observer’s world line. Consider two events that occur at points P
and Q in space-time. Show that if, as measured by the observer, the two events occur at the same
time t, then the line joining P and Q is orthogonal to the observer’s world line at the time t; that is,
$(x_P - x_Q) \cdot dx = 0$, where dx joins two neighboring points on the world line at times t and t + dt.

section 15.10 The Light Cone

15.39 * Suppose that a point P in space-time with coordinates $x = (\mathbf{x}, x_4)$ lies inside the backward
light cone as seen in frame S. This means that x • x < 0 and $x_4 < 0$ at least in frame S. Prove that
these two conditions are satisfied in all frames. Since this means that all observers agree that t < 0, this 
justifies calling the inside of the backward light cone the absolute past. 

15.40 * Show that the statement that a point x in space-time lies on the forward light cone is Lorentz
invariant. 

15.41 * In the proposition on page 627, it is obvious that at least one of the three statements has to be 
true. In the proof given there, I showed that if statement (1) is true, then so are statements (2) and (3). 
To complete the proof, show that (2) implies (1) and (3). [Strictly speaking you should also check that 
(3) implies (1) or (2), but this is so similar to the argument already given that you needn’t bother.] 

15.42 * Prove that if x is time-like and x • y = 0, then y is space-like.

15.43 * (a) Show that if a body has speed v < c in one inertial frame, then v < c in all frames.
[Hint: Consider the displacement four-vector $dx = (d\mathbf{x}, c\,dt)$, where $d\mathbf{x}$ is the three-dimensional
displacement in a short time dt.] (b) Show similarly that if a signal (such as a pulse of light) has
speed c in one frame, its speed is c in all frames. 

15.44 ** (a) Show that if q is time-like, there is a frame S' in which it has the form $q' = (0, 0, 0, q'_4)$.
(b) Show that if q is forward time-like in one frame S, then it is forward time-like in all inertial frames. 

section 15.11 The Quotient Rule and Doppler Effect

15.45 * The quotient rule derived at the start of Section 15.11 is only one of several similar quotient 
rules. Here is another. Suppose that k and x are both known to be four-vectors and that in every inertial 
frame k is a multiple of x. That is, k = λx in frame S, and k' = λ'x' in frame S', and so on. Then the
factor λ (the “quotient” of k and x) is in fact a four-scalar with the same value in all frames, λ = λ'.
Prove this quotient rule. 

15.46 * (a) Show that in the case that the source is approaching the observer head on, the Doppler
formula (15.64) can be rewritten as $\omega = \omega_0\sqrt{(1 + \beta)/(1 - \beta)}$. (b) What is the corresponding result
for the case that the source is moving directly away from the observer? 
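
A quick numerical comparison can confirm the head-on form in part (a) before you derive it. The sketch below assumes that the general Doppler formula (15.64), which is not reproduced in this problem set, has the usual form $\omega = \omega_0/[\gamma(1 - \beta\cos\theta)]$ with θ = 0 for a head-on approach; the function names are hypothetical.

```python
import math

def doppler_general(omega0, beta, theta):
    """Assumed general form: omega0 / [gamma * (1 - beta*cos(theta))]."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return omega0 / (gamma * (1.0 - beta * math.cos(theta)))

def doppler_head_on(omega0, beta):
    """Head-on approach form quoted in Problem 15.46(a)."""
    return omega0 * math.sqrt((1.0 + beta) / (1.0 - beta))

# Compare the two expressions at theta = 0 for a few sample speeds.
for beta in (0.1, 0.5, 0.9):
    difference = doppler_general(1.0, beta, 0.0) - doppler_head_on(1.0, beta)
    print(beta, abs(difference) < 1e-12)
```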

15.47 * Consider the tale of the physicist who is ticketed for running a red light and argues that because
he was approaching the intersection, the red light was Doppler shifted and appeared green. How fast
would he have to have been going? ($\lambda_{\rm red} \approx 650$ nm and $\lambda_{\rm green} \approx 530$ nm.)

15.48 ** The factor γ in the Doppler formula (15.64), which can be ascribed to time dilation, means
that even when θ = 90° there is a Doppler shift. (In classical physics there is no Doppler shift when
θ = 90° and the source has zero velocity in the direction of the observer.) This transverse Doppler shift
is therefore a test of time dilation, and has yielded some very accurate tests of the theory. However,
except when the source is moving very close to the speed of light, the transverse shift is quite small.
(a) If V = 0.2c, what is the percentage shift when θ = 90°? (b) Compare this with the shift when the
source approaches the observer head-on. 

section 15.12 Mass, Four-Velocity, and Four-Momentum 

15.49 * Show that the four-velocity of any object has invariant length squared $u \cdot u = -c^2$.

15.50 * For any two objects a and b, show that the scalar product of their four-velocities is
$u_a \cdot u_b = -c^2\gamma(v_{\rm rel})$, where γ(v) denotes the usual γ factor, $\gamma(v) = 1/\sqrt{1 - v^2/c^2}$, and $v_{\rm rel}$ denotes the speed
of a in the rest frame of b or vice versa.

15.51 ** (a) For the collision shown in Figure 15.11, verify that all four components of the total
four-momentum $p_a + p_b$ [with the individual momenta defined relativistically as in (15.68)] are conserved
in the frame S of part (a). (b) In two lines or less, prove that total four-momentum is conserved in the
frame S' of part (b). [This problem does not, of course, prove that the law of conservation of
four-momentum is generally true, but it does at least show that the law is consistent with the collision of
Figure 15.11.]

15.52 ** (a) Suppose that the total three-momentum $\mathbf{P} = \sum \mathbf{p}$ of an isolated system is conserved in
all inertial frames. Show that if this is true (which it is), then the fourth component $P_4$ of the total
four-momentum $P = (\mathbf{P}, P_4)$ has to be conserved as well. (b) Using the zero-component theorem of
Problem 15.35, you can prove the following stronger result very quickly: If any one component of the 
total four-momentum P is conserved in all frames, then all four components are conserved. 

15.53 ** For any two objects a and b, show that

$$p_a \cdot p_b = -m_a E_b = -m_b E_a = -m_a m_b c^2 \gamma(v_{\rm rel})$$

where $m_a$ is the mass of a, and $E_b$ is the energy of b in a’s rest frame, and vice versa, and $v_{\rm rel}$ is the
speed of a in the rest frame of b (or vice versa).

15.54 *** (a) Using the correct relativistic velocity-addition formulas make a table showing the four 
velocities as seen in frame S' of the collision of Figure 15.11(b) in terms of the initial velocity of a 
in S. [Give the latter some simple name, such as $\mathbf{v}_a = (\xi, \eta, 0)$.] (b) Add a column showing the total
classical momentum $m_a\mathbf{v}'_a + m_b\mathbf{v}'_b$ before and after the collision, and show that the y component of
the classical momentum is not conserved in S'. 

15.55 *** Since the four-velocity $u = \gamma(\mathbf{v}, c)$ is a four-vector, its transformation properties are simple.
Write down the standard Lorentz boost for all four components of u. Use these to deduce the relativistic 
velocity-addition formula for v. 

section 15.13 Energy, the Fourth Component of Momentum 

15.56 * When oxygen combines with hydrogen in the reaction (15.80) about 5 eV of energy is released 
(that is, the kinetic energy of the two final molecules is 5 eV more than that of the initial three molecules).
(a) By how much does the total rest mass of the molecules change? (b) What is the fractional change in
total mass? (c) If one were to form 10 grams of water by this reaction, what would be the total change 
in mass? 

15.57 * When a radioactive nucleus of astatine 215 decays at rest, the whole atom is torn into two in
the reaction

$$^{215}\mathrm{At} \to {}^{211}\mathrm{Bi} + {}^{4}\mathrm{He}.$$

The masses of the three atoms are (in order) 214.9986, 210.9873, and 4.0026, all in atomic mass units.
(1 atomic mass unit = $1.66 \times 10^{-27}$ kg = 931.5 MeV/$c^2$.) What is the total kinetic energy of the two
outcoming atoms, in joules and in MeV? 

15.58 * (a) What is a particle’s speed if its kinetic energy T is equal to its rest energy? (b) What if its 
energy E is equal to n times its rest energy? 

15.59 * If one defines a variable mass $m_{\rm var} = \gamma m$, then the relativistic momentum $\mathbf{p} = \gamma m\mathbf{v}$ becomes
$m_{\rm var}\mathbf{v}$, which looks more like the classical definition. Show, however, that the relativistic kinetic energy
is not equal to $\tfrac{1}{2}m_{\rm var}v^2$.

15.60 * A particle of mass $m_a$ decays at rest into two identical particles each of mass $m_b$. Use
conservation of momentum and energy to find the speed of the outgoing particles. 

15.61 * A particle of mass 3 MeV/$c^2$ has momentum 4 MeV/c. What are its energy (in MeV) and speed
(in units of c)? 

15.62 * A particle of mass 12 MeV/$c^2$ has a kinetic energy of 1 MeV. What are its momentum (in
MeV/c) and its speed (in units of c)? 
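
Problems 15.61 and 15.62 both rest on the relation $E^2 = (pc)^2 + (mc^2)^2$ together with $\beta = pc/E$. The sketch below shows the bookkeeping in the units these problems use (masses in MeV/$c^2$, momenta in MeV/c, energies in MeV, so that factors of c drop out). The function names and the sample numbers are illustrative and deliberately differ from those in the problems.

```python
import math

def energy_and_speed(mass, momentum):
    """Given mass (MeV/c^2) and momentum (MeV/c), return (E in MeV, beta)."""
    energy = math.hypot(momentum, mass)  # sqrt(p^2 + m^2) with c = 1
    return energy, momentum / energy

def momentum_and_speed(mass, kinetic_energy):
    """Given mass (MeV/c^2) and kinetic energy (MeV), return (p in MeV/c, beta)."""
    energy = mass + kinetic_energy       # E = m c^2 + T
    momentum = math.sqrt(energy ** 2 - mass ** 2)
    return momentum, momentum / energy

# Illustrative values only: m = 5, p = 12, and m = 8, T = 2.
print(energy_and_speed(5.0, 12.0))
print(momentum_and_speed(8.0, 2.0))
```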

15.63 * (a) What is a mass of 1 MeV/$c^2$ in kilograms? (b) What is a momentum of 1 MeV/c in kg·m/s?

15.64 * As measured in the inertial frame S, a proton has four-momentum p. Also as measured in S,
an observer at rest in a frame S' has four-velocity u. Show that the proton’s energy, as measured by this
observer, is $-u \cdot p$.

15.65 ** The relativistic kinetic energy of a particle is $T = (\gamma - 1)mc^2$. Use the binomial series to
express T as a series in powers of β = v/c. (a) Verify that the first term is just the nonrelativistic kinetic
energy, and show that to lowest order in β the difference between the relativistic and nonrelativistic
kinetic energies is $3\beta^4 mc^2/8$. (b) Use this result to find the maximum speed at which the nonrelativistic
value is within 1% of the correct relativistic value. 
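
The leading correction quoted in part (a) can be checked numerically before working through the series. The sketch below measures energies in units of $mc^2$ and compares the exact difference with $3\beta^4/8$ at a small, arbitrarily chosen speed; the function name is hypothetical.

```python
import math

def kinetic_energy_gap(beta):
    """Return (T_rel - T_nonrel) in units of m c^2 for speed beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    t_rel = gamma - 1.0            # (gamma - 1) m c^2
    t_nonrel = 0.5 * beta * beta   # (1/2) m v^2
    return t_rel - t_nonrel

beta = 0.01                        # illustrative small speed
leading_term = 3.0 * beta ** 4 / 8.0
ratio = kinetic_energy_gap(beta) / leading_term
print(ratio)                       # approaches 1 as beta -> 0
```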

15.66 ** In nonrelativistic mechanics, the energy contains an arbitrary additive constant — no physics 
is changed by the replacement E → E + constant. Show that this is not the case in relativistic mechanics.
[Hint: Remember that the four-momentum p is supposed to transform like a four-vector.]

section 15.14 Collisions

15.67 * Two balls of equal masses (m each) approach one another head-on with equal but opposite 
velocities of magnitude 0.8c. Their collision is perfectly inelastic, so they stick together and form a 
single body of mass M. What is the velocity of the final body and what is its mass M?

15.68 * Consider the elastic, head-on collision of Example 15.10, in which two particles (masses $m_a$
and $m_b$) approach one another traveling along the x axis, collide, and emerge traveling along the same
axis. In the CM frame (by its definition) $\mathbf{p}_a^{\rm in} = -\mathbf{p}_b^{\rm in}$. Use conservation of momentum and energy to
prove that $\mathbf{p}_a^{\rm fin} = -\mathbf{p}_a^{\rm in}$; that is, the momentum of particle a (and likewise b) just reverses itself in the
CM frame. 

15.69 * (a) Show that the four-momentum of any material particle (m > 0) is forward time-like.
(b) Show that the sum of any two forward time-like vectors is itself forward time-like, and hence
that the sum of any number of forward time-like vectors is itself forward time-like. 

15.70 * (a) Use the results of Problem 15.69 to prove that for any number of material particles there
exists a CM frame, that is, a frame in which the total three-momentum is zero. (b) Relative to an
arbitrary frame S, show that the velocity of the CM frame is given by $\mathbf{v} = \sum \mathbf{p}\,c^2 / \sum E$.

15.71 * One way to create exotic heavy particles is to arrange a collision between two lighter particles

$$a + b \to d + e + \cdots + g$$

where d is the heavy particle of interest and e, ···, g are other possible particles produced in the
reaction. (A good example of such a process is the production of the ψ particle in the process
$e^+ + e^- \to \psi$, in which there are no other particles e, ···, g.) (a) Assuming that $m_d$ is much heavier
than any of the other particles, show that the minimum (or threshold) energy to produce this reaction in
the CM frame is $E_{\rm cm} \approx m_d c^2$. (b) Show that the threshold energy to produce the same reaction in the
lab frame, where the particle b is initially at rest, is $E_{\rm lab} \approx m_d^2 c^2/2m_b$. (c) Calculate these two energies
for the process $e^+ + e^- \to \psi$, with $m_e \approx 0.5$ MeV/$c^2$ and $m_\psi \approx 3100$ MeV/$c^2$. Your answers should
explain why particle physicists go to the trouble and expense of building colliding-beam experiments.

15.72 * A mad physicist claims to have observed the decay of a particle of mass M into two identical 
particles of mass m, with M < 2m. In response to the objections that this violates conservation of 
energy, he replies that if M was traveling fast enough it could easily have energy greater than $2mc^2$ and
hence could decay into the two particles of mass m. Show that he is wrong. [He has forgotten that both
energy and momentum are conserved. You can analyse this problem in terms of these two conservation
laws, but it is much simpler to go to the rest frame of M.] 

15.73 ** Consider the head-on elastic collision of Example 15.10, in which the final velocity $\mathbf{v}_b$ of
particle b is given by (15.95). (a) Show that, in the special case that the masses are equal ($m_a = m_b$),
$\mathbf{v}_b = \mathbf{v}_a$, the initial velocity of particle a. Show that in this case the final velocity of a is zero. [This result
for equal-mass collisions is well known in classical mechanics; you have now shown that it extends
to relativity.] (b) Show that in the nonrelativistic limit (15.95) reduces to $\mathbf{v}_b = 2\mathbf{v}_a m_a/(m_a + m_b)$. By
doing the necessary nonrelativistic calculations, show that this agrees with the nonrelativistic answer 
for elastic head-on collisions. 

15.74 ** A particle a traveling along the positive x axis of frame S with speed 0.5c decays into
two identical particles, a → b + b, both of which continue to travel on the x axis. (a) Given that
$m_a = 2.5\,m_b$, find the speed of either b particle in the rest frame of particle a. (b) By making the
necessary transformation on the result of part (a), find the velocities of the two b particles in the original 
frame S. 

15.75 ** A particle of unknown mass M decays into two particles of known masses $m_a$ = 0.5 GeV/$c^2$
and $m_b$ = 1.0 GeV/$c^2$, whose momenta are measured to be $\mathbf{p}_a$ = 2.0 GeV/c along the $x_2$ axis and
$\mathbf{p}_b$ = 1.5 GeV/c along the $x_1$ axis. (1 GeV = $10^9$ eV.) Find the unknown mass M and its speed.

15.76 ** Particle a is pursuing particle b along the $x_1$ axis of a frame S. The two masses are $m_a$ and
$m_b$ and the speeds are $v_a$ and $v_b$ (with $v_a > v_b$). When a catches up with b they collide and coalesce to
form a single particle of mass m and speed v. Show that

$$m^2 = m_a^2 + m_b^2 + 2m_a m_b\gamma(v_a)\gamma(v_b)(1 - v_a v_b/c^2)$$


and find v. 
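
Before proving the mass formula of Problem 15.76, it can be reassuring to see it agree with a direct numerical application of energy and momentum conservation. The sketch below works in units with c = 1 and uses illustrative masses and speeds that are not taken from the text; the function names are hypothetical.

```python
import math

def gamma(beta):
    """Usual Lorentz factor for speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def mass_from_conservation(ma, va, mb, vb):
    """Mass of the coalesced particle from total E and p (units with c = 1)."""
    energy = gamma(va) * ma + gamma(vb) * mb
    momentum = gamma(va) * ma * va + gamma(vb) * mb * vb
    return math.sqrt(energy ** 2 - momentum ** 2)

def mass_from_formula(ma, va, mb, vb):
    """Mass predicted by the expression quoted in Problem 15.76 (c = 1)."""
    m_squared = ma ** 2 + mb ** 2 + 2.0 * ma * mb * gamma(va) * gamma(vb) * (1.0 - va * vb)
    return math.sqrt(m_squared)

# Illustrative values: m_a = 1, v_a = 0.8c, m_b = 2, v_b = 0.6c.
args = (1.0, 0.8, 2.0, 0.6)
print(abs(mass_from_conservation(*args) - mass_from_formula(*args)) < 1e-12)
```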

15.77 *** Consider the elastic head-on collision of Example 15.10, in which particle a collides with a
stationary particle b. (a) Assuming that $m_a \neq m_b$, show that the final kinetic energy of particle a satisfies
$T_a^{\rm fin} < (m_a - m_b)^2 c^2/2m_b$. [Hint: Look at the CM frame where you can show that the four-vector
$p_a^{\rm fin} - p_b^{\rm in}$ is time-like, so that $(p_a^{\rm fin} - p_b^{\rm in})^2 < 0$.] (b) The result of part (a) implies that if $T_a^{\rm in}$ is
large, almost all this incoming energy is lost to b. This is quite different from the nonrelativistic
situation. Prove that in nonrelativistic mechanics the proportion of kinetic energy retained by a is
fixed, independent of $T_a^{\rm in}$. Specifically, $T_a^{\rm fin} = T_a^{\rm in}(m_a - m_b)^2/(m_a + m_b)^2$.

15.78 *** Consider the elastic collision shown in Figure 15.17. In the lab frame S, particle b is initially
at rest; particle a enters with four-momentum $p_a$ and scatters through an angle θ; particle b recoils at an
angle ψ. In the CM frame S', the two particles approach and emerge with equal and opposite momenta,
and particle a scatters through an angle θ'. (a) Show that the velocity of the CM frame relative to the
lab frame is $\mathbf{V} = \mathbf{p}_a c^2/(E_a + m_b c^2)$. (b) By transforming the final momentum of a back from the CM
to the lab frame, show that

$$\tan\theta = \frac{\sin\theta'}{\gamma_V(\cos\theta' + V/v'_a)} \qquad (15.153)$$

where $v'_a$ is the speed of a in the CM frame. (c) Show that in the limit that all speeds are much smaller
than c, this result agrees with the nonrelativistic result (14.53) (where $\lambda = m_a/m_b$). (d) Specialize now
to the case that $m_a = m_b$. Show that, in this case, $V/v'_a = 1$, and find a formula like (15.153) for tan ψ.

Figure 15.17 Problem 15.78.


(e) Show that the angle between the two outgoing momenta is given by $\tan(\theta + \psi) = 2/(\beta_V^2\gamma_V\sin\theta')$.
Show that in the limit that V ≪ c, you recover the well-known nonrelativistic result that θ + ψ = 90°.

section 15.15 Force in Relativity 

15.79 * Consider an object of mass m (which you may assume is constant), acted on by a force F. 
From the definition (15.100) prove that 

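$$\mathbf{F} = \gamma m\mathbf{a} + (\mathbf{F} \cdot \mathbf{v})\mathbf{v}/c^2,$$

where $\mathbf{a} = d\mathbf{v}/dt$ is the object’s acceleration. Notice that it is certainly not true in relativity that F = ma.
Nor is it true that $\mathbf{F} = m_{\rm var}\mathbf{a}$, where $m_{\rm var}$ is the variable mass $m_{\rm var} = \gamma m$, except in the special case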
that F happens to be perpendicular to v. In general, F and a are not even in the same direction. 

15.80 * A particle of mass m and charge q moves in a uniform, constant magnetic field B. Show that 
if v is perpendicular to B, the particle moves in a circle of radius 

$$r = |p/qB|. \qquad (15.154)$$

[This result agrees with the nonrelativistic result (2.81), except that p is now the relativistic momentum
$\mathbf{p} = \gamma m\mathbf{v}$.]
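
The practical difference between the relativistic and classical momentum in (15.154) is just a factor of γ in the radius. The sketch below evaluates both in SI units for an illustrative case (a proton at 0.6c in a 1 T field, with rounded constants); none of these numbers or names come from the text.

```python
import math

C = 2.99792458e8  # speed of light in m/s

def orbit_radius(mass_kg, charge_coulomb, beta, field_tesla, relativistic=True):
    """Radius r = |p/(qB)| for motion perpendicular to a uniform field B."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2) if relativistic else 1.0
    momentum = gamma * mass_kg * beta * C  # p = gamma*m*v, or m*v classically
    return abs(momentum / (charge_coulomb * field_tesla))

# Illustrative inputs: approximate proton mass and charge, beta = 0.6, B = 1 T.
m_p, q_p = 1.6726e-27, 1.6022e-19
r_rel = orbit_radius(m_p, q_p, 0.6, 1.0)
r_classical = orbit_radius(m_p, q_p, 0.6, 1.0, relativistic=False)

# The two radii differ by exactly the factor gamma.
print(r_rel / r_classical)
```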

15.81 * An electron (mass 0.5 MeV/$c^2$) moves with speed 0.7c in a circular path in a magnetic field
of 0.02 teslas. Using the relativistic result (15.154) of Problem 15.80, find the radius of the electron’s
orbit. What would your answer have been if you used the classical definition of momentum? [Needless
to say, the relativistic result is confirmed by experiment, and this gave some of the first evidence of the
correctness of relativistic mechanics.³³]

15.82 * (a) Verify the result (15.106) for the position of a particle moving in a uniform electric field, 
by integrating the expression (15.105). (b) When t is small, the particle should be moving slowly and 


33 In a paper in Göttingen Nachrichten, p. 143 (1901), Walter Kaufmann showed that the electron’s “apparent
mass” (what we would call its variable mass) in a magnetic field seemed to increase with speed in rough accord
with the relativistic formula $m_{\rm var} = \gamma(v)m$. Notice that this predated Einstein’s first paper on relativity by some
four years. 

(15.106) should agree with the nonrelativistic result $x = \tfrac{1}{2}at^2$. Verify that it does. (c) Show that when
t is large, x ≈ ct + const and explain this result.

15.83 * Starting from the definition (15.100) of the force F on an object, prove that the transformation
of the components of F as we pass from a frame S to a second frame S', traveling at speed V in the
standard configuration relative to S, is

$$F'_1 = \frac{F_1 - \beta\,\mathbf{F}\cdot\mathbf{v}/c}{1 - \beta v_1/c}, \qquad F'_2 = \frac{F_2}{\gamma(1 - \beta v_1/c)}, \qquad F'_3 = \frac{F_3}{\gamma(1 - \beta v_1/c)} \qquad (15.155)$$

where β = β(V) and γ = γ(V) relate to the relative speed of the two frames and v is the velocity of
the object as measured in S.


15.84 ** A mass m is thrown from the origin at t = 0 with initial three-momentum $\mathbf{p}_0$ in the y direction.
If it is subject to a constant force $\mathbf{F}_0$ in the x direction, find its velocity v as a function of t, and by
integrating v find its trajectory. Check that in the nonrelativistic limit the trajectory is the expected 
parabola. 

15.85 ** We have seen that there are processes in which the mass of an object varies with time.
(a) Starting from (15.85), prove that $dm/dt_0 = -u \cdot K/c^2$, where $t_0$ is the object’s proper time, u is its
four-velocity, and K is the four-force on the object. (b) This means that the necessary and sufficient
condition that a force doesn’t change an object’s mass is that u • K = 0. It is an experimental fact that
if a charged particle is at rest in an electromagnetic field (even instantaneously) then dE/dt = 0. Use
this to argue that electromagnetic forces do not cause a particle’s mass to change. 


section 15.16 Massless Particles; the Photon 

15.86 * The neutral pion $\pi^0$ is an unstable particle (mass m = 135 MeV/$c^2$) that can decay into two
photons, $\pi^0 \to \gamma + \gamma$. (a) If the pion is at rest, what is the energy of each photon? (b) Suppose instead
that the pion is traveling along the x axis and that the photons are observed also traveling along the x
axis, one forward and one backward. If the first photon has three times the energy of the second, what
was the pion’s original speed v?

15.87 * A neutral pion (Problem 15.86) is traveling with speed v when it decays into two photons, 
which are seen to emerge at equal angles θ on either side of the original velocity. Show that v = c cos θ.

15.88 * Two particles a and b with masses $m_a = 0$ and $m_b > 0$ approach one another. Prove that they
have a CM frame (that is, a frame in which their total three-momentum is zero). [Hint: As you should 
explain, this is equivalent to showing that the sum of two four-vectors, one of which is forward light-like 
and one forward time-like, is itself forward time-like.] 

15.89 * Show that any two zero-mass particles have a CM frame, provided their three-momenta are 
not parallel. [Hint: As you should explain, this is equivalent to showing that the sum of two forward 
light-like vectors is forward time-like, unless the spatial parts are parallel.] 

15.90 ** The first positrons to be observed were created in electron-positron pairs by high-energy
cosmic-ray photons in the upper atmosphere. (a) Show that an isolated photon cannot convert to
an electron-positron pair in the process $\gamma \to e^+ + e^-$. [Show that this process inevitably violates
conservation of four-momentum.] (b) What actually occurs is that a photon collides with a stationary
nucleus with the result

$$\gamma + \text{nucleus} \to e^+ + e^- + \text{nucleus}.$$

Convince yourself that the formula (15.98) can be used to find the minimum energy for a photon to
induce this reaction. [The derivation of (15.98) assumed that the incident particle had m > 0.] Show
that, provided the mass of the nucleus is much greater than that of the electron, the minimum photon
energy to induce this reaction is approximately $2m_e c^2$. [This is exactly the energy one would have
calculated for the process $\gamma \to e^+ + e^-$ and shows that the role of the nucleus is just as a “catalyst”
that can absorb some three-momentum.] 

15.91 ** An excited state X* of an atom at rest drops to its ground state X by emitting a photon. In
atomic physics it is usual to assume that the energy $E_\gamma$ of the photon is equal to the difference in energies
of the two atomic states, $\Delta E = (M^* - M)c^2$, where M and M* are the rest masses of the ground and
excited states of the atom. This cannot be exactly true, since the recoiling atom X must carry away
some of the energy ΔE. Show that in fact $E_\gamma = \Delta E[1 - \Delta E/(2M^*c^2)]$. Given that ΔE is of order a
few eV, while the lightest atom has M of order 1 GeV/$c^2$, discuss the validity of the assumption that
$E_\gamma = \Delta E$.

15.92 ** A positive pion decays at rest into a muon and neutrino, $\pi^+ \to \mu^+ + \nu$. The masses involved
are $m_\pi$ = 140 MeV/$c^2$, $m_\mu$ = 106 MeV/$c^2$, and $m_\nu$ = 0. (There is now convincing evidence that $m_\nu$ is
not exactly zero, but it is small enough that you can take it to be zero for this problem.) Show that the
speed of the outgoing muon has $\beta = (m_\pi^2 - m_\mu^2)/(m_\pi^2 + m_\mu^2)$. Evaluate this numerically. Do the same
for the much rarer decay mode $\pi^+ \to e^+ + \nu$ ($m_e$ = 0.5 MeV/$c^2$).

15.93 *** Consider a head-on elastic collision between a high-energy electron (energy $E_0$ and speed
$\beta_0 c$) and a photon of energy $E_{\gamma 0}$. Show that the final energy $E_\gamma$ of the photon is

$$E_\gamma = E_0\,\frac{1 + \beta_0}{2 + (1 - \beta_0)E_0/E_{\gamma 0}}$$

[Hint: Use (15.123).] Show that $E_\gamma < E_0$, but that if $\beta_0 \to 1$, then $E_\gamma/E_0 \to 1$; that is, a very high-energy
electron loses almost all its energy to the photon in a head-on collision. What fraction of its
original energy would the electron retain if $E_0 \approx 10$ TeV and the photon was in the visible range,
$E_{\gamma 0} \approx 3$ eV? (Remember that the mass of the electron is about 0.5 MeV/$c^2$; 1 TeV = $10^{12}$ eV.)

section 15.17 Tensors 

15.94 * Prove that for any two matrices A and B, where A has as many columns as B has rows, the
transpose of AB satisfies $\widetilde{(AB)} = \tilde{B}\tilde{A}$.

15.95 * By making suitable choices for the n-dimensional vectors a and b, show that if $\tilde{a}Cb = \tilde{a}Db$
for any choices of a and b (where C and D are n x n matrices), then C = D. 

15.96 * Prove that if T and a are respectively a four-tensor and a four-vector, then b = T • a = TGa is
a four-vector; that is, it transforms according to the rule b' = Λb.

15.97 * (a) A tensor T is said to be symmetric if $T_{\mu\nu} = T_{\nu\mu}$. Prove that if T is symmetric in one inertial
frame, then it is symmetric in all inertial frames. (b) T is antisymmetric if $T_{\mu\nu} = -T_{\nu\mu}$. Prove that if
T is antisymmetric in one inertial frame, then it is antisymmetric in all inertial frames. (An example 
of the latter property is the electromagnetic field tensor, which is antisymmetric in all frames.) 

15.98 ** (a) Use the invariance of the scalar product $a \cdot b = \tilde{a}Gb$ to prove that the 4 × 4 Lorentz
transformation matrices Λ must satisfy the condition (15.136), $\tilde{\Lambda}G\Lambda = G$. (b) Verify that the standard
Lorentz boost (15.43) does satisfy this condition. 
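
Part (b) of Problem 15.98 can be checked numerically for a sample speed. The sketch below assumes the conventions used elsewhere in these problems ($x_4 = ct$, so that G = diag(1, 1, 1, −1)) and assumes that the standard boost (15.43), which is not reproduced here, mixes only $x_1$ and $x_4$ with entries γ and −γβ. The helper names and the choice β = 0.6 are illustrative.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]

def standard_boost(beta):
    """Assumed standard boost along x_1 acting on (x_1, x_2, x_3, x_4 = ct)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, 0.0, 0.0, -g * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-g * beta, 0.0, 0.0, g]]

G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

boost = standard_boost(0.6)
result = matmul(matmul(transpose(boost), G), boost)

# Every entry of (Lambda-transpose) G Lambda should match G to rounding error.
print(all(abs(result[i][j] - G[i][j]) < 1e-12
          for i in range(4) for j in range(4)))
```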

15.99 ** A useful form of the quotient rule for three-dimensional vectors is this: Suppose that a and b
are known to be three-vectors and suppose that for every orthogonal set of axes there is a 3 x 3 matrix 
T with the property that b = Ta for every choice of a, then T is a tensor, (a) Prove this, (b) State and 
prove the corresponding rule for four-vectors and four-tensors. 

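15.100 *** (a) The statement that ∇ is a vector operator means that if φ(x) is any scalar, then
the three components of $\nabla\phi = (\partial\phi/\partial x_1, \partial\phi/\partial x_2, \partial\phi/\partial x_3)$ transform according to the three-vector
transformation law (15.126). Prove this last statement. [Hint: Remember the chain rule, that
$\partial\phi/\partial x_i = \sum_j (\partial x'_j/\partial x_i)\,\partial\phi/\partial x'_j$.] (b) Prove that in four-dimensional space-time, if φ is any four-scalar, the
quantity □φ defined with the components

$$\Box\phi = \left(\frac{\partial\phi}{\partial x_1}, \frac{\partial\phi}{\partial x_2}, \frac{\partial\phi}{\partial x_3}, -\frac{\partial\phi}{\partial x_4}\right) \qquad (15.156)$$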


(note well the minus sign on the fourth component) is a four-vector. This result is crucial in writing 
down Maxwell’s equations for the electromagnetic field. 


section 15.18 Electrodynamics and Relativity 

15.101 * (a) Prove that $\mathbf{E} \cdot \mathbf{B}$ and $E^2 - c^2B^2$ are both invariant under any Lorentz transformation. 
[Use the transformation equations (15.146) to prove the required results for the standard boost and 
then explain why if either quantity is invariant under the standard boost then it is invariant under any 
Lorentz transformation.] Use these results to prove the following two propositions: (b) If $\mathbf{E}$ and $\mathbf{B}$ are 
perpendicular in frame S, then they are perpendicular in any other frame S'. (c) If $E > cB$ in a 
frame S, then there cannot exist a frame in which $\mathbf{E} = 0$. 

15.102 * (a) Starting from the transformation equations (15.146) for the standard boost along the $x_1$ 
axis, find the corresponding boost along the $x_3$ axis. (b) Write down the inverse of this transformation 
and then verify the results (15.151) and (15.152) for the fields of a moving line charge. 

15.103 * (a) Using the transformation equations (15.146), show that if $\mathbf{E} = 0$ in frame S, then 
$\mathbf{E}' = \mathbf{v} \times \mathbf{B}'$ in S'. (b) Similarly, show that if $\mathbf{B} = 0$ in frame S, then $\mathbf{B}' = -\mathbf{v} \times \mathbf{E}'/c^2$. 

15.104 ** We defined the electromagnetic field tensor by the equation $K = q\mathcal{F} \cdot u = q\mathcal{F}Gu$, where $K$ 
is the four-force on a charge q and $u$ is its four-velocity. (a) Starting from the Lorentz force (15.139) 
write down the four components of $K$ [as in (15.141)]. (b) Use these to find the matrix $\mathcal{F}G$ and show 
that the tensor $\mathcal{F}$ has the form claimed in (15.143). 

15.105 ** Since $\mathcal{F}$ is a four-tensor it has to transform according to the rule (15.145), $\mathcal{F}' = \Lambda\mathcal{F}\tilde{\Lambda}$. 
Using the form (15.143) for $\mathcal{F}$ and the standard Lorentz boost for $\Lambda$, find the matrix $\mathcal{F}'$ and verify the 
transformation equations (15.146) for the electromagnetic fields. 

15.106 ** Derive the Lorentz-force law from Coulomb’s law as follows: (a) If a charge q is at rest 
in frame S', then Coulomb’s law tells us that the force on q is F' = qE'. Use the inverse of the force 
transformation (15.155) in Problem 15.83 to write down the force F as seen in S. (Answer in terms of 
E' for now.) (b) Now use the field transformation (15.146) to rewrite your answer in terms of E and B 
and show that F = q(E + v x B). 

15.107 ** It is a result well known in classical electromagnetism that one can introduce a three-scalar 
potential $\phi$ and a three-vector potential $\mathbf{A}$ such that the fields $\mathbf{E}$ and $\mathbf{B}$ can be written as 

$$\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t} \quad\text{and}\quad \mathbf{B} = \nabla \times \mathbf{A}. \tag{15.157}$$

In relativity these potentials are combined to form a single four-potential $A = (\mathbf{A}, \phi/c)$. Prove that 

$$\mathcal{F}_{\mu\nu} = \square_\mu A_\nu - \square_\nu A_\mu$$

where $\square$ is the four-dimensional gradient operator defined in (15.156) of Problem 15.100. (If we accept 
that $A$ really is a four-vector, this gives an alternative proof that $\mathcal{F}$ is a four-tensor.) 

15.108 ** Consider an electric charge distribution, with charge density $\varrho$, moving with velocity $\mathbf{v}$ 
relative to a frame S. (a) Show that $\varrho = \gamma\varrho_0$, where $\varrho_0$ is the charge density in the rest frame. (Notice that 
$\mathbf{v}$ can vary with position, so different parts of the distribution will have different rest frames, but that’s 
all right.) (b) The three-current density is defined as $\mathbf{J} = \varrho\mathbf{v}$. Show that the four-current density, defined 
as $J = (\mathbf{J}, c\varrho)$, is a four-vector. (c) It is a well-known result in electromagnetism that conservation of 
charge implies the so-called equation of continuity, $\nabla \cdot \mathbf{J} + \partial\varrho/\partial t = 0$ (where $\nabla \cdot \mathbf{J} = \sum_i \partial J_i/\partial x_i$ is the 
so-called divergence of $\mathbf{J}$). Show that this condition is equivalent to the manifestly invariant condition 
$\square \cdot J = 0$, where $\square$ is the four-dimensional gradient defined in (15.156) of Problem 15.100. 

15.109 Two equal charges q are moving side-by-side in the positive x direction in frame S. The 
distance between them is r and their speed is v. Find the force on either charge due to the other in two 
ways: (a) Find the force in their rest frame S' and transform back to S, using the force transformation 
(15.155) of Problem 15.83. Note that the force in S is less than in the rest frame. (b) Find the electric 
and magnetic fields in S' and thence in S, using the field transformation (15.146). Use these fields (in S) 
to write down the Lorentz force on either charge in S. Note that in S there is a repulsive electric force 
and an attractive magnetic force. As $\beta \to 1$ they become nearly equal and their resultant approaches 
zero. 


15.110 A charge q is moving with constant speed v along the x axis of frame S with position $vt\,\hat{\mathbf{x}}$. 
(a) Write down the electric and magnetic fields in the charge’s rest frame S'. (b) Use the inverse of 
the field transformation (15.146) to write down the electric field in the original frame S. [In the first 
instance, you will find $\mathbf{E}$ in terms of the primed variables x', y', z', t', but you can use the standard 
Lorentz transformation to eliminate them in favor of x, y, z, t.] Show that the field at position $\mathbf{r}$ and 
time t is 

$$\mathbf{E} = \frac{kq(1 - \beta^2)}{(1 - \beta^2\sin^2\theta)^{3/2}}\,\frac{\hat{\mathbf{R}}}{R^2} \tag{15.158}$$

where $\mathbf{R} = \mathbf{r} - vt\,\hat{\mathbf{x}}$ is the vector pointing from the charge’s position to the point of observation $\mathbf{r}$, and 
$\theta$ is the angle between $\mathbf{R}$ and the x axis. (c) Sketch the behavior of the field strength as a function of $\theta$ 
for fixed R, and make a sketch of the electric field lines at one fixed time t. 


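15.111 Two of Maxwell’s four equations read 

$$\nabla \times \mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{J} \quad\text{and}\quad \nabla \cdot \mathbf{E} = \frac{\varrho}{\epsilon_0} \tag{15.159}$$

where $\mathbf{J}$ and $\varrho$ are the current and charge densities that gave rise to the fields. Show that these 
two equations can be written as the single four-vector equation $\square \cdot \mathcal{F} = -\mu_0 J$, where $\square$ is the 
four-dimensional gradient operator introduced in Problem 15.100, $J$ is the four-current $(\mathbf{J}, c\varrho)$, and the 
scalar product $\square \cdot \mathcal{F} = \tilde{\square}G\mathcal{F}$. 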



CHAPTER 16 


Continuum Mechanics 


We can divide classical mechanics into three main areas, in order of increasing 
complexity. (1) The mechanics of point masses. Occasionally, these point masses 
are elementary particles, such as the electron, whose mass is (as far as we know) 
concentrated at a point; but usually they are extended objects whose mass is certainly 
not localized at a point but which, for the purposes at hand, can be approximated 
as if it were. Thus in treating the flight of baseballs, or finding the orbits of the 
planets, it is often an excellent approximation to treat them as point masses. In 
this case, the configuration of any system is given by a finite set of coordinates, three 
for each point mass. (2) The mechanics of rigid bodies. Here we acknowledge that 
the mass of interest is spread out over some nonzero volume, but we assume that the 
relative positions of the various parts of any one body are fixed; that is, all bodies are 
rigid. As we saw, we have to allow for the possible rotational motion of such bodies, 
but the configuration of any system is still specified by a discrete and finite set of 
coordinates; for example, for a single rigid body we need just six coordinates, three 
for the CM position and three more for the body’s orientation. The notion of a rigid 
body is an idealization — all real bodies can be deformed — but for many systems it 
is a reasonable and extremely useful approximation. (3) Finally, there is continuum 
mechanics, in which we acknowledge that the mass in a system can be distributed 
over some region and that the relative positions of the various parts can change in 
a continuous, but otherwise arbitrary, manner. Clearly the motion of any fluid — the 
flow of air past an airplane wing or of water down a pipe — is a problem in continuum 
mechanics. But so also is the motion of a solid when the independent motions of its 
parts are important, as for example in the flexing of a heavily loaded steel beam or the 
vibration of the earth’s crust in response to an earthquake. In continuum mechanics a 
system comprises a continuously infinite number of parts, and the specification of its 
configuration requires an infinite number of coordinates. 

So far in this book we have treated only the first two topics, the mechanics of 
point masses and rigid bodies — what we could call discrete mechanics — with a 
discrete, finite number of coordinates. This final chapter is intended as the briefest 
of introductions to continuum mechanics. A thorough introduction would require 
another book, but I hope that I can give at least a sense of some of the central ideas. 
Specifically, we’ll see how the passage from discrete to continuum mechanics changes 
the ordinary differential equations of the former to partial differential equations. We’ll 
see how these partial differential equations often lead to the wave equation, which 
governs the behavior of sound waves in liquids and gases, of seismic waves in the 
earth’s crust, and of many other waves, most notably electromagnetic waves such as 
light and microwaves. The first three sections deal with one-dimensional continuous 
systems, and then, in Section 16.4 we move out into three dimensions. Perhaps the 
biggest complication in three dimensions is that the forces and displacements involve 
two tensors, the stress and strain tensors. One of the main objectives of this chapter 
is to introduce these two important concepts. Section 16.7 introduces the stress tensor 
for fluids and solids, and Section 16.8 the strain tensor for solids. Section 16.9 gives 
the relation between the two tensors, the generalized Hooke’s law. Sections 16.10 
and 16.11 derive the equation of motion for an elastic solid and use it to analyze the 
longitudinal and transverse waves in a solid. The last two sections are a very brief 
introduction to the mechanics of inviscid fluids. In Section 16.12, we’ll derive the 
equation of motion and the so-called continuity equation, and in Section 16.13 we’ll 
use them to analyze the possible waves in a fluid. 

Before we get started, I should emphasize that the notion of a continuous distribution 
of matter is itself an idealization. The properties of the air flowing in a wind 
tunnel certainly appear to vary continuously and smoothly from place to place. For 
example, we are used to assuming that air has a density $\varrho(\mathbf{r})$ which gives the mass 
$\varrho(\mathbf{r})\,dV$ in a small volume $dV$, and, while $\varrho(\mathbf{r})$ may certainly vary with $\mathbf{r}$, we assume 
that it does so smoothly. However, we know perfectly well that, when viewed under a 
super-microscope that could resolve fractions of nanometers, the air would be seen to 
consist of individual molecules, and the density $\varrho(\mathbf{r})$ would vary wildly between large 
values near to each molecule and zero in the huge spaces in between. Fortunately, the 
scale of these wild variations is tiny compared to the scales of normal interest. For 
example, even if we were interested in regions as small as 1 mm$^3$, this volume contains 
some $10^{16}$ molecules. Thus the density $\varrho(\mathbf{r})$ that we actually work with is the density 
averaged over this huge number of molecules and does indeed vary smoothly with $\mathbf{r}$. 
The idea that, at a scale of millimeters or more, matter can be treated as continuous, 
with parameters such as the density being averages over many molecules, is called the 
continuum hypothesis. The success of continuum mechanics is ample justification 
for this hypothesis, which we shall adopt throughout this chapter. 
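
As a rough check on the figure of some $10^{16}$ molecules in 1 mm$^3$, one can treat air as an ideal gas, so that the number density is $P/(k_B T)$. The sketch below does this arithmetic; the pressure and temperature are assumed round values for ordinary room conditions, not numbers taken from the text.

```python
# Rough ideal-gas estimate of the number of air molecules in 1 mm^3.
# P and T_ROOM are assumed room conditions (illustrative values only).
K_B = 1.380649e-23   # Boltzmann constant, J/K
P = 101325.0         # pressure, Pa (1 atm, assumed)
T_ROOM = 293.0       # temperature, K (about 20 C, assumed)


def molecules_in_volume(volume_m3, pressure=P, temperature=T_ROOM):
    """Number of ideal-gas molecules in the given volume (in m^3)."""
    number_density = pressure / (K_B * temperature)  # molecules per m^3
    return number_density * volume_m3


ONE_MM3 = 1e-9  # 1 mm^3 expressed in m^3
print(f"{molecules_in_volume(ONE_MM3):.1e} molecules in 1 mm^3")
```

Under these assumptions the count comes out at a few times $10^{16}$, consistent with the order of magnitude quoted above.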


16.1 Transverse Motion of a Taut String 


As our first example of a continuous system, let us consider a taut string lying along 
the x axis. In equilibrium, we assume that the string lies exactly on the x axis, but now 
suppose that it is undergoing a small motion (perhaps an oscillatory motion) in the y 
direction. One simple way to specify the string’s configuration at any one time is to 
give its displacement u(x) from the x axis, as shown in Figure 16.1(a). Specifically, 
at any one time, a small element of the string whose equilibrium position was at x on 

Figure 16.1 (a) The position of a continuous string, at any one time, is specified 
by the function $u(x)$, which gives the string’s displacement from its equilibrium 
position on the x axis. (b) A set of n point masses joined by massless strings has its 
configuration specified by the discrete set of displacements $u_i$, with $i = 1, \ldots, n$. 


the axis is now located a distance $y = u(x)$ above the axis. This scheme needs little 
explanation, but is worth contrasting with a related discrete system, namely a set of n 
point masses $m_1, \ldots, m_n$ joined by a taut massless string, which lies in equilibrium on 
the x axis. If these masses are allowed to move in the y direction as in Figure 16.1(b), 
their configuration can be specified by their displacements $u_1, \ldots, u_n$ from the axis. 
Where the discrete system is specified by these n variables $u_i$, with $i = 1, \ldots, n$, our 
continuous system is specified by the continuous function $u(x)$. The role of the discrete 
index i attached to $u_i$ is now played by the continuous variable x in $u(x)$. Where the 
index i specifies which of the n masses is at position $y = u_i$, the variable x specifies 
which of the infinitely many pieces of the string is at $y = u(x)$. 

If the systems of Figure 16.1 are moving, the displacements u depend on the time t. 
In the discrete case they become $u_i(t)$ and in the continuous case we must write $u(x,t)$, 
a function of two variables. In the discrete case, Newton’s second law becomes a set 
of ordinary differential equations for the $u_i(t)$ (for example the coupled differential 
equations of Chapter 11). In the continuous case, Newton’s law becomes a partial 
differential equation for $u(x,t)$, involving partial derivatives with respect to both x 
and t, as we now show. 

To explore the motion of the string, we shall apply Newton’s second law to a small 
segment AB of the string, between x and x + dx as shown in Figure 16.2. To simplify 
our discussion we shall ignore gravity, and we shall assume that the displacement 
$u(x,t)$ remains so small for all x and all t that the string remains nearly parallel to 
the x axis. This guarantees that the string’s length is essentially unchanged and hence 
that the tension T remains the same for all x and all t. The net force on the segment 
AB is then just $\mathbf{F}^{\text{net}} = \mathbf{F}_1 + \mathbf{F}_2$, where $\mathbf{F}_1$ and $\mathbf{F}_2$ are the tension forces due to the 
adjacent sections of string, as shown in Figure 16.2. If $\phi$ denotes the angle between 
the string and the x axis, the x component of this net force is 

$$F_x^{\text{net}} = T\cos(\phi + d\phi) - T\cos\phi$$

but, since $\phi$ and $\phi + d\phi$ are both small, both cosines are very close to 1 and $F_x^{\text{net}}$ is 
negligible, consistent with our assumption that the motion is in the y direction only. 
On the other hand, the y component is certainly not negligible: 

$$F_y^{\text{net}} = T\sin(\phi + d\phi) - T\sin\phi = T\cos\phi\,d\phi. \tag{16.1}$$

Figure 16.2 The two forces on the small element AB of string, which lies between x and x + dx, are the 
tension forces $\mathbf{F}_1$ and $\mathbf{F}_2$ exerted by the adjacent sections of string. 


Since $\phi$ is small we can replace $\cos\phi$ by 1, and we can write $d\phi = (\partial\phi/\partial x)\,dx$. [The 
derivative is a partial derivative since $\phi = \phi(x, t)$ depends on x and t.] Finally, again 
since $\phi$ is small, $\phi = \partial u/\partial x$, the slope of the string. Therefore, 

$$F_y^{\text{net}} = T\frac{\partial\phi}{\partial x}\,dx = T\frac{\partial^2 u}{\partial x^2}\,dx. \tag{16.2}$$


By Newton’s second law, $F_y^{\text{net}} = ma_y$, where $a_y = \partial^2 u/\partial t^2$ is the acceleration 
and m is the mass of the segment AB, equal to $\mu\,dx$ if we use $\mu$ to denote the linear 
mass density of the string. Therefore, 

$$F_y^{\text{net}} = \mu\frac{\partial^2 u}{\partial t^2}\,dx. \tag{16.3}$$

Equating the two expressions (16.2) and (16.3), we arrive at the equation of motion 
of our taut string: 

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2}. \tag{16.4}$$

Here I have introduced the important constant 

$$c = \sqrt{\frac{T}{\mu}} \tag{16.5}$$

where $T$ is the tension in our string and $\mu$ is its linear mass density (mass/length). 

The equation of motion (16.4) is called the one-dimensional wave equation since 
its solutions are waves traveling along the string, as we shall see. As anticipated, it is a 
partial differential equation, involving derivatives with respect to x and t. The constant 
c has the dimensions of speed (as you should check) and is the speed with which the 
waves travel. The wave equation (16.4) governs the motion of many different waves — 
waves on a string, waves of sound or light, seismic waves, and many more. Therefore, 
I shall give it a new section of its own. 
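
To make the dimension check concrete: tension is a force (kg m/s$^2$) and the linear density is in kg/m, so their ratio has units m$^2$/s$^2$ and its square root is a speed. The short sketch below evaluates $c = \sqrt{T/\mu}$; the function name and the sample tension and density are hypothetical, chosen only for illustration.

```python
import math


def wave_speed(tension, linear_density):
    """Speed c = sqrt(T / mu) of transverse waves on a taut string (SI units)."""
    if tension <= 0 or linear_density <= 0:
        raise ValueError("tension and linear density must be positive")
    return math.sqrt(tension / linear_density)


# Hypothetical string: T = 100 N, mu = 0.01 kg/m (illustrative values only).
print(wave_speed(100.0, 0.01), "m/s")
```

For these assumed values the speed is about 100 m/s; quadrupling the tension doubles it, as the square root requires.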


16.2 The Wave Equation 


We are going to show that there are just three kinds of solution of the wave equation 
(16.4): (1) a disturbance $u(x,t)$ that travels rigidly along the string from left to right; 
(2) a disturbance $u(x,t)$ that travels rigidly along the string from right to left; and 
(3) any combination of these two. The proof of this claim is startlingly simple, although 
it depends on a trick that you probably wouldn’t think of right away. We change 
variables from x and t to 


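$$\xi = x - ct \quad\text{and}\quad \eta = x + ct. \tag{16.6}$$

It is a straightforward exercise (Problem 16.4) to show that 

$$\frac{\partial^2 u}{\partial t^2} - c^2\frac{\partial^2 u}{\partial x^2} = -4c^2\frac{\partial}{\partial\xi}\frac{\partial u}{\partial\eta} \tag{16.7}$$

so, written in terms of the new variables, the wave equation (16.4) becomes simply 

$$\frac{\partial}{\partial\xi}\frac{\partial u}{\partial\eta} = 0. \tag{16.8}$$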


To solve this equation, let us temporarily write $\partial u/\partial\eta = h$, so that (16.8) becomes 
$\partial h/\partial\xi = 0$. This states simply that $h$ does not depend on $\xi$, although it can, of course, 
depend on $\eta$. Therefore, we can write $h = h(\eta)$, and we now have 

$$\frac{\partial u}{\partial\eta} = h(\eta).$$

For any given value of $\xi$ we can integrate this equation to give $u = \int h(\eta)\,d\eta$ + 
“constant,” where the “constant” may be different for different values of $\xi$. If we 
call this “constant” $f(\xi)$ and set the integral $\int h(\eta)\,d\eta = g(\eta)$, then we have proved 
that every solution of (16.8) must have the form 

$$u = f(\xi) + g(\eta). \tag{16.9}$$

By substituting into the left side of (16.8), you can see that a function of this form is 
a solution of (16.8) for any choice of the two functions $f(\xi)$ and $g(\eta)$. Thus, (16.9) 
is the general solution of (16.8). 

Reverting to the original variables x and t, we have shown that the general solution 
of the wave equation (16.4) has the form 


$$u(x, t) = f(x - ct) + g(x + ct) \tag{16.10}$$




Figure 16.3 Motion of the wave (16.11). At time 0, the disturbance is given 
by $u = f(x)$. At a later time t, it is given by $u = f(x - ct)$, which has the 
same shape but has moved rigidly to the right by a distance ct. 


where $f$ and $g$ are any two functions. To see what these solutions represent, let us 
consider first the case that the function $g = 0$, so that our solution is just 

$$u(x,t) = f(x - ct). \tag{16.11}$$

What does this solution look like? Notice first that at time $t = 0$, the solution is 
$u(x, 0) = f(x)$; that is, the function $f(x)$ is just the disturbance at time $t = 0$. Figure 
16.3 shows a possible such function. The solid curve, with a large maximum at $x = 0$ 
and a small dip on the left, shows the function $f(x)$, the shape of the disturbance at 
$t = 0$. At a later time t, the disturbance is given by $f(x - ct)$. Since $f(x)$ has its 
maximum at $x = 0$, it follows that $f(x - ct)$ has its maximum when $x - ct = 0$. 
Therefore, the maximum that was at $x = 0$ is now at $x = ct$. Since a similar argument 
applies to any point of the curve (for example, the minimum on the left), we conclude 
that the whole disturbance has moved bodily to the right by a distance ct. That is, the 
disturbance is a wave traveling rigidly to the right with speed c. 
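
The claim that any function of the form $f(x - ct) + g(x + ct)$ solves the wave equation can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the two profiles and the value of c are arbitrary choices, and the second derivatives are estimated by central differences, so the residual is small rather than exactly zero.

```python
import math

C = 2.0  # wave speed (arbitrary illustrative value)


def f(s):
    """Right-moving profile (hypothetical smooth pulse)."""
    return math.exp(-s * s)


def g(s):
    """Left-moving profile (hypothetical smooth wave)."""
    return 0.5 * math.sin(s)


def u(x, t):
    """Candidate solution u = f(x - ct) + g(x + ct)."""
    return f(x - C * t) + g(x + C * t)


def wave_residual(x, t, h=1e-3):
    """Central-difference estimate of u_tt - c^2 u_xx at the point (x, t)."""
    u_tt = (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / h**2
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2
    return u_tt - C**2 * u_xx


for x, t in [(0.0, 0.0), (0.7, 0.3), (-1.2, 1.5)]:
    print(x, t, wave_residual(x, t))
```

The printed residuals should be tiny compared with the individual second derivatives, which are of order one for these profiles.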

A similar argument shows that a solution of the form $u(x,t) = g(x + ct)$ represents 
a wave traveling rigidly to the left with speed c, and the general solution (16.10) 
is a superposition of two waves, one traveling to the right and the other to the left. 
The functions $f$ and $g$ that appear in the general solution (16.10) are determined by 
the initial conditions of any particular problem. As you might guess, to determine a 
particular solution we need to specify the position $u$ and the initial velocity $\dot{u} = \partial u/\partial t$ 
at one initial time, as in the following example. 


example 16.1 Evolution of a Triangular Wave 

A short segment of a long taut string is pulled aside and released from rest at 
t = 0, so that its initial displacement is 

$$u(x, 0) = u_0(x) \tag{16.12}$$

where $u_0(x)$ is the triangular function shown in Figure 16.4(a). Find the disturbance 
$u(x,t)$ for any later time t. 

The solution must have the form (16.10), where the two functions $f$ and $g$ 
are to be determined by the initial conditions. The given initial displacement 
(16.12) implies that 

$$f(x) + g(x) = u_0(x). \tag{16.13}$$

Figure 16.4 (a) The initial displacement of a string at $t = 0$ is given by 
the triangular function $u_0(x)$. (b) At any later time, the wave consists of 
two triangles, each half as high as the original, one traveling to the right 
and one to the left. 


This does not, by itself, determine $f$ and $g$ separately, and we must also look 
at the initial velocity. Differentiating (16.10) with respect to t, we see that the 
initial velocity of the string (in the y direction, of course) is 

$$\dot{u}(x, 0) = -cf'(x) + cg'(x)$$

where the prime denotes differentiation of a function with respect to its argument. 
In our case, the string is released from rest, so $f'(x) - g'(x) = 0$. 
Integrating with respect to x, we conclude that¹ 

$$f(x) - g(x) = 0. \tag{16.14}$$


Solving (16.13) and (16.14), we conclude that 

$$f(x) = g(x) = \tfrac{1}{2}u_0(x)$$

and the actual disturbance (16.10), at any time t, is 

$$u(x, t) = f(x - ct) + g(x + ct) = \tfrac{1}{2}u_0(x - ct) + \tfrac{1}{2}u_0(x + ct). \tag{16.15}$$

Our original triangle has separated into two triangles, half as high, traveling 
outward in opposite directions with speed c, as shown in Figure 16.4(b). 

It is interesting to let the solution (16.15) evolve backward to times t < 0. 
At these times it represents two triangles approaching the origin from opposite 
sides. When t is close to 0, the two triangles meet and start to interfere. At 
$t = 0$ they overlap exactly and interfere to produce a triangular wave twice their 
individual heights, and then, as t increases past 0, they separate again and move 
apart as in Figure 16.4(b). 
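
The solution (16.15) is simple enough to evaluate directly. The following sketch assumes a particular triangle (unit height, base from $x = -1$ to $x = 1$) and unit wave speed; these numbers are illustrative choices, since the dimensions of the triangle in Figure 16.4 are not specified here.

```python
C = 1.0           # wave speed (illustrative value)
HEIGHT = 1.0      # peak height of the initial triangle (assumed)
HALF_WIDTH = 1.0  # half the base width of the triangle (assumed)


def u0(x):
    """Initial triangular displacement centered on x = 0."""
    return max(0.0, HEIGHT * (1.0 - abs(x) / HALF_WIDTH))


def u(x, t):
    """Displacement (16.15): two half-height copies of u0 moving apart."""
    return 0.5 * u0(x - C * t) + 0.5 * u0(x + C * t)


for t in (0.0, 0.5, 3.0):
    samples = [round(u(x, t), 3) for x in (-3.0, -1.5, 0.0, 1.5, 3.0)]
    print(f"t = {t}: {samples}")
```

At $t = 0$ the two halves coincide and reproduce the full triangle; once $ct$ exceeds the half width they no longer overlap, and each has half the original height. Negative values of t show the same two triangles approaching the origin.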


¹ Strictly speaking there should be a constant of integration in (16.14), but as you can easily 
check, it cancels out of $u = f + g$, so we can just as well choose it to be zero. 

An important special case of the solution (16.10) is the case that the functions $f$ 
and $g$ are sinusoidal. If $g = 0$, then the disturbance takes the form 

$$u(x, t) = A\sin[k(x - ct)] = A\sin(kx - \omega t) \tag{16.16}$$

where $A$ and $k$ are arbitrary constants, and $\omega = kc$. This is a sinusoidal wave traveling 
to the right with amplitude $A$, wave number $k$ (or wavelength $\lambda = 2\pi/k$) and angular 
frequency $\omega$ (or period $\tau = 2\pi/\omega$). If we replace $x - ct$ with $x + ct$, we obtain a 
similar sinusoidal wave 

$$u(x, t) = A\sin[k(x + ct)] = A\sin(kx + \omega t) \tag{16.17}$$

traveling to the left. The sum of these two solutions is itself a solution, 

$$u(x, t) = A\sin(kx - \omega t) + A\sin(kx + \omega t) = 2A\sin(kx)\cos(\omega t). \tag{16.18}$$

(Use the relevant trig identities to check this.) This solution has the remarkable 
property that it is not traveling at all (neither to right nor left). Instead it simply 
oscillates up and down like $\cos(\omega t)$, with amplitude (at any one point x) equal to 
$2A\sin(kx)$. In particular, at those points (the nodes) where $kx$ is an integer multiple 
of $\pi$ ($kx = n\pi$), the string does not move at all, as shown in Figure 16.5. We see 
that by superposing two carefully chosen traveling waves we have formed a standing 
wave. As we shall see in the next section, these standing waves play an important role 
in the oscillations of a finite length of string and are, in fact, the continuum analog of 
the normal modes of a system of coupled oscillators. 
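
The trig identity behind (16.18), and the fact that the nodes stay at rest, can be confirmed numerically. In this sketch the amplitude, wave number, and angular frequency are arbitrary assumed values, and the function names are hypothetical.

```python
import math

A, K, OMEGA = 1.5, 2.0, 3.0  # amplitude, wave number, angular frequency (arbitrary)


def traveling_sum(x, t):
    """Sum of the right-moving wave (16.16) and the left-moving wave (16.17)."""
    return A * math.sin(K * x - OMEGA * t) + A * math.sin(K * x + OMEGA * t)


def standing(x, t):
    """Standing-wave form on the right side of (16.18)."""
    return 2 * A * math.sin(K * x) * math.cos(OMEGA * t)


def node_position(n):
    """Position of the n-th node, where kx = n*pi."""
    return n * math.pi / K


for x, t in [(0.3, 0.7), (1.1, 2.4)]:
    print(traveling_sum(x, t) - standing(x, t))  # differences at rounding level
print([standing(node_position(2), t) for t in (0.0, 0.4, 1.3)])  # essentially zero
```

Halfway between two nodes, at $kx = (n + \tfrac12)\pi$, the same function gives the full antinode amplitude $2A$ at $t = 0$.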



Figure 16.5 The standing wave (16.18) at three successive times, $t = 0$ (solid 
curve), $t = \tau/4$ (long dashes), and $t = \tau/2$ (short dashes), where $\tau$ is the period. 
The small dots on the x axis are the nodes, where $kx = n\pi$ and the string does 
not move at all. Half way between any two successive nodes is an antinode, 
where the string oscillates up and down with maximum amplitude $2A$. 


16.3 Boundary Conditions; Waves on a Finite String* 


* As usual, sections marked with an asterisk can be omitted on a first reading. 

So far we have been assuming, implicitly, that our string is infinitely long, or at 
least that it is so long that we can ignore any effects of its ends. Real strings are, 
of course, finite in length and have ends. The motion of the string itself is governed 
by the same wave equation (16.4) as before, but the existence of the ends imposes 
additional boundary conditions on its solutions. These boundary conditions vary 
with the nature of the string’s ends; for instance, the string may be tied down at an 
end, or it may simply flap freely, and the boundary conditions appropriate to these 
two situations are quite different. Here we shall consider just one type of boundary 
condition and one method of solving for it. Specifically, we’ll consider a string that 
is tied down at both ends (at x = 0 and x = L), and we shall solve this problem by a 
method analogous to our discussion of normal modes in Chapter 11. 


Normal Modes 

The problem that we have to solve is this: For 0 < x < L, the displacement u(x, t) of 
our string must satisfy the wave equation (16.4) 


$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2}, \tag{16.19}$$

with initial conditions that fix the position $u$ and velocity $\dot{u}$ at $t = 0$; in addition, it 
must satisfy the boundary conditions at $x = 0$ and $x = L$ that 


u(0, t) = u(L, t) = 0 


(16.20) 


for all times t. Of the several ways to solve this problem, we shall follow the approach 
of Chapter 11; that is, we shall start by looking for solutions that vary sinusoidally in 
time, with the form 


$$u(x, t) = X(x)\cos(\omega t - \delta), \tag{16.21}$$

where the function $X(x)$ and the constants $\omega$ and $\delta$ are to be determined. As usual, 
there is nothing to stop us seeking solutions with this form, and, as before, we shall 
find that such solutions do exist and that any solution of our problem can be written 
in terms of them. 

Substitution of the assumed form (16.21) into the wave equation (16.19) reduces 
the latter to the form 

$$-\omega^2X(x)\cos(\omega t - \delta) = c^2\frac{d^2X(x)}{dx^2}\cos(\omega t - \delta)$$

or 

$$\frac{d^2X(x)}{dx^2} = -k^2X(x) \tag{16.22}$$

where 

$$k = \frac{\omega}{c}. \tag{16.23}$$

We see that the assumption of the sinusoidal time dependence (16.21) has reduced the 
partial differential equation (16.19) to the ordinary differential equation² (16.22), 
an equation, moreover, whose solutions we can easily write down. 

The general solution of (16.22) is just 

$$X(x) = a\cos(kx) + b\sin(kx) \tag{16.24}$$

and this yields a solution of the wave equation (16.19) for any choice of the constants 
$a$ and $b$. However, we have still to satisfy the boundary conditions (16.20), which 
require that 

$$X(0) = X(L) = 0. \tag{16.25}$$

The condition that $X(0) = 0$ requires simply that the coefficient $a$ in (16.24) be zero, 
so that $X(x) = b\sin(kx)$. Thus, the condition that $X(L) = 0$ requires that either $b = 0$ 
or that $\sin(kL) = 0$. In the former case, our solution is identically zero, and the string 
doesn’t move (a solution, but a trivial one). If $\sin(kL) = 0$, then $kL$ must be an integer 
multiple of $\pi$, and we get a nontrivial solution, 

$$u(x, t) = \sin(kx)\,A\cos(\omega t - \delta), \tag{16.26}$$

where the boundary conditions have forced $k$ to have one of the values 

$$k = k_n = \frac{n\pi}{L} \qquad [n = 1, 2, 3, \cdots]. \tag{16.27}$$

By (16.23), $\omega = ck$, so the corresponding frequency $\omega$ must have the form 

$$\omega = \omega_n = \frac{n\pi c}{L} \qquad [n = 1, 2, 3, \cdots]. \tag{16.28}$$

We conclude that there are indeed solutions in which the string oscillates sinusoidally 
with a single frequency $\omega$, provided $\omega$ has one of the values (16.28). This 
result is reminiscent of Chapter 11, where we found that a system of n coupled oscillators 
could oscillate in any of various sinusoidal normal modes with frequencies 
$\omega_1, \cdots, \omega_n$. The main difference is that the systems of Chapter 11 had a finite number 
of degrees of freedom and an equal number of normal frequencies. Here, our string has 
an infinite number of degrees of freedom and an infinite number of normal frequencies 
as in (16.28). Figure 16.6 shows the three normal modes of lowest frequency for our 
string: the fundamental and the first two overtones. If you compare these pictures 
with Figure 16.5, you will see that each of the normal modes of our finite string is 
just a section of a standing wave on an infinite string. That our finite string is fixed 
at its two ends means that the points $x = 0$ and $x = L$ must be nodes, which requires 
that the length $L$ must equal an integer number of half wavelengths, $L = n\lambda/2$. Since 
$\lambda = 2\pi/k$, this explains the condition (16.27) that $k = n\pi/L$. 
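
A minimal sketch of (16.27) and (16.28) in code may help fix the pattern of allowed values. The string length and wave speed used here are hypothetical (roughly guitar-like) and are not taken from the text; the function names are likewise illustrative.

```python
import math


def mode_wave_number(n, length):
    """Allowed wave number k_n = n*pi/L from (16.27)."""
    return n * math.pi / length


def mode_frequency(n, length, c):
    """Angular frequency omega_n = n*pi*c/L from (16.28)."""
    return c * mode_wave_number(n, length)


# Hypothetical string: L = 0.65 m, c = 286 m/s (illustrative values only).
L, c = 0.65, 286.0
for n in (1, 2, 3):
    omega = mode_frequency(n, L, c)
    wavelength = 2 * math.pi / mode_wave_number(n, L)
    print(n, omega / (2 * math.pi), "Hz", wavelength, "m")
```

For each n the wavelength equals $2L/n$, so the string holds exactly n half wavelengths, and the ordinary frequency $\omega_n/2\pi$ is n times that of the fundamental.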

The allowed frequencies (16.28) of our string are all integer multiples of the lowest 
frequency, $\omega_n = n\omega_1$. They are, among many other things, the frequencies at which any 


² Our method of solution here is closely related to the method of separation of variables. See 
Problem 16.9. 

Figure 16.6 The three lowest-frequency normal modes (16.26) of a string 
of length L fixed at both ends. In each picture the solid curve, the long 
dashes, and the short dashes show the string at three successive times, 
separated by a quarter cycle. The n = 1 mode is called the fundamental. 


stringed musical instrument, such as a piano or guitar, can vibrate. The corresponding 
modes, including the fundamental, are called the harmonics of the string, since they 
“harmonize” well (that is, make a pleasing sound to the ear) when played together. 


The General Solution 

The normal modes (16.26) determine all possible motions of our finite string, in the 
sense that any possible motion can be expanded in terms of the normal-mode solutions. 
To see this we shall need to use some of the properties of Fourier series described in 
Section 5.7. First, let us note that any motion of the string is given by a function u(x,t) 
that satisfies the wave equation (16.19) and the boundary conditions (16.20), and is 
determined by its initial position $u(x, 0) = u_0(x)$ and velocity $\dot{u}(x, 0) = \dot{u}_0(x)$. To see 
that any such solution can be expanded in terms of the normal modes, we first rewrite 
the normal-mode solution (16.26) in the “sine plus cosine” form as 

$$u(x, t) = \sin k_nx\,(B_n\cos\omega_nt + C_n\sin\omega_nt). \tag{16.29}$$

Our claim is that any possible motion can be written as a linear combination of these 
normal-mode solutions: 

$$u(x, t) = \sum_{n=1}^{\infty}\sin k_nx\,(B_n\cos\omega_nt + C_n\sin\omega_nt). \tag{16.30}$$

To prove this, we note first that this linear combination certainly satisfies the wave 
equation and the boundary conditions that $u$ vanish at $x = 0$ and $x = L$. At time $t = 0$, 
the claimed solution is just 

$$u(x, 0) = \sum_{n=1}^{\infty}B_n\sin k_nx. \tag{16.31}$$

This is a Fourier sine series, and the coefficients $B_n$ can be chosen so that $u(x, 0)$ 
matches any given initial value $u_0(x)$.³ Similarly, the initial velocity of the proposed solution 
(16.30) is 

$$\dot{u}(x, 0) = \sum_{n=1}^{\infty}\omega_nC_n\sin k_nx \tag{16.32}$$

and we can choose the coefficients $C_n$ so that this is equal to any given initial velocity 
$\dot{u}_0(x)$. We conclude that the proposed solution satisfies the equation of motion and the 
boundary conditions, and that by choice of the coefficients $B_n$ and $C_n$ it can match any 
given initial conditions. Therefore, any possible motion of the string can be expanded 
in terms of the normal modes as in (16.30). 

Since all of the frequencies in (16.30) are integer multiples of the lowest frequency 
($\omega_n = n\omega_1$), every term in (16.30) is periodic with period $\tau = 2\pi/\omega_1$. Therefore, every 
possible motion of our finite string is periodic with this period. [Of course, if certain 
coefficients in (16.30) are zero, the motion may be periodic with a shorter period as 
well, but every solution has the period of the fundamental mode.] 


Example 16.2 A Triangular Wave on a Finite String

A string of length L = 8 is fixed at both ends. It is given a small triangular
displacement, as in Figure 16.7, and released from rest at t = 0. Find the Fourier
coefficients $B_n$ and $C_n$ in the expansion (16.30) and, using some reasonable finite
number of terms to approximate the infinite series, plot the position of the string
at five equally spaced times from t = 0 to t = $\tau/2$, where $\tau$ is the period of the
motion.

Since the string is initially at rest, all of the coefficients $C_n$ are zero. The
coefficients $B_n$ are given by the integral

$$B_n = \frac{2}{L}\int_0^L u_0(x) \sin\frac{n\pi x}{L}\,dx. \tag{16.33}$$

[This is nearly, but not quite, the standard formula (5.84); for details, see 
Problem 16.13.] It is easy to see that this is zero when n is even. When n is 
odd, we can write n = 2m + 1 and you can check (Problem 16.10) that 


$$B_{2m+1} = (-1)^m \frac{32}{(2m+1)^2\pi^2}\left[1 - \cos\frac{(2m+1)\pi}{8}\right]. \tag{16.34}$$


Putting these coefficients into the expansion (16.30) and choosing some 
reasonable finite number of terms, we can get a good approximation for the
displacement u(x, t) for all times t. Using just the first five or so terms, we get a 
moderate approximation, but since it is very little trouble (for a computer) to use 
more terms, I chose to use the sum of the first twenty. The results are shown in 


3 There is a small complication that I am ignoring here. The series (16.31) is not the usual Fourier 
series since it is missing any cosine terms. However, it contains twice as many sine terms as the usual 
Fourier series since $k_n = n\pi/L$ (as opposed to the usual $2n\pi/L$), and one can prove that this series
can be used to expand any (reasonable) function on the interval 0 < x < L. See Problem 16.13. 






Figure 16.7 A string is released from rest at t = 0 in the triangular position shown.


Figure 16.8. Each of these five plots deserves careful attention. The first shows 
the initial displacement of Figure 16.7, as approximated by the first 20 terms of 
its Fourier series. The approximation is very good, although it fails to reproduce 
the sharp turn at the apex. (Obviously, no finite sums of sines or cosines can 
actually have an instantaneous change of slope.) In the second picture, the initial 
triangle has split into two separate triangles, traveling in opposite directions. 
This is exactly the behavior we saw in Example 16.1 (Figure 16.4); because 
neither of the waves has reached the boundaries at x = 0 and L, the motion is so 
far unaffected by the presence of the boundaries. Skipping for a moment over 
the third picture, you can see in the fourth that each triangle has been reflected 
by the walls, and is now traveling back toward the center, although they are now 
inverted. In the third picture, the original and the reflected waves are both present 
and are interfering destructively to produce zero net displacement. Finally, in the 
last picture, the two reflected waves have coalesced momentarily into a single 
inverted triangle. If we were to follow the motion further, we would see the two 
reflected waves continue on until they hit the opposite walls and reflect again. 
(See Problem 16.11.) 


[Figure 16.8 panels, in order: t = 0, t = $\tau/8$, t = $\tau/4$, t = $3\tau/8$, t = $\tau/2$]

Figure 16.8 Five successive snapshots of a string that was released from the
initial position of Figure 16.7, calculated using the first twenty nonzero terms of
the Fourier series (16.30). The first picture shows the initial position (approximated
by 20 terms of its Fourier series). The four succeeding pictures show the
position at intervals of $\tau/8$, where $\tau$ is the period of the fundamental.
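The calculation of Example 16.2 can be sketched numerically. The following Python sketch assumes NumPy is available. Since Figure 16.7 is not reproduced here, it also assumes that the initial triangle has unit height, occupies 3 ≤ x ≤ 5, and has its apex at x = 4; this is the shape that reproduces the coefficients (16.34), but it is an inference, not something stated in the text. The function names are illustrative only.

```python
import numpy as np

L = 8.0  # string length given in Example 16.2


def u0(x):
    """ASSUMED initial shape: unit-height triangle on 3 <= x <= 5, apex at x = 4."""
    return np.clip(1.0 - np.abs(np.asarray(x, dtype=float) - 4.0), 0.0, None)


def b_closed(n):
    """Coefficient B_n from the closed form (16.34); zero when n is even."""
    if n % 2 == 0:
        return 0.0
    m = (n - 1) // 2
    return (-1) ** m * 32.0 / (n * np.pi) ** 2 * (1.0 - np.cos(n * np.pi / 8.0))


def b_numeric(n, samples=20001):
    """Coefficient B_n from the integral (16.33), by the trapezoidal rule."""
    x = np.linspace(0.0, L, samples)
    y = u0(x) * np.sin(n * np.pi * x / L)
    dx = x[1] - x[0]
    return (2.0 / L) * dx * np.sum(0.5 * (y[:-1] + y[1:]))


def string_shape(x, t_over_tau, n_terms=20):
    """Sum of the first n_terms nonzero (odd-n) terms of (16.30), with C_n = 0.

    Time is measured in units of the fundamental period tau, so that
    omega_n * t = 2 * pi * n * (t / tau) and the wave speed never appears.
    """
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for m in range(n_terms):
        n = 2 * m + 1
        total += (b_closed(n) * np.sin(n * np.pi * x / L)
                  * np.cos(2.0 * np.pi * n * t_over_tau))
    return total


if __name__ == "__main__":
    x = np.linspace(0.0, L, 9)
    for frac in (0.0, 0.125, 0.25, 0.375, 0.5):
        print(f"t = {frac:5.3f} tau:", np.round(string_shape(x, frac), 3))
```

Because every nonzero term has odd n, the factor cos(2πn t/τ) vanishes at t = τ/4 and equals −1 at t = τ/2. The sketch therefore gives zero displacement in the third snapshot and an inverted copy of the initial shape in the last, in agreement with the discussion of Figure 16.8.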
16.4 The Three-Dimensional Wave Equation


We live in a three-dimensional world, and the wave equation (16.4) 

$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2} \tag{16.35}$$


needs to be generalized to three dimensions. It is not hard to guess what the proper
generalization should be. If $p = p(x, y, z, t) = p(\mathbf{r}, t)$ denotes some sort of disturbance
in a three-dimensional system (for example, the pressure in a sound wave traveling
through the air), then we would surely guess that the appropriate generalization of
(16.35) should be

$$\frac{\partial^2 p}{\partial t^2} = c^2 \left( \frac{\partial^2 p}{\partial x^2} + \frac{\partial^2 p}{\partial y^2} + \frac{\partial^2 p}{\partial z^2} \right). \tag{16.36}$$


We shall meet a couple of examples of disturbances that do indeed satisfy this
three-dimensional wave equation later in this chapter. In particular, I shall prove in Section
16.13 that the pressure$^4$ in any inviscid fluid (for example, air) is an example, for
which the wave speed c is given by

$$c = \sqrt{\frac{\mathrm{BM}}{\varrho_0}} \tag{16.37}$$

where BM denotes the bulk modulus of the fluid, and $\varrho_0$ is the equilibrium density.
(I’ll define the bulk modulus shortly. For the moment, just see it as a parameter that 
characterizes the resistance of the fluid to compression.) 

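It is usual to streamline the notation in the wave equation (16.36): If, as usual, we
take the view that $\nabla$ is the “vector” with components

$$\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right),$$

then the scalar product of $\nabla$ with itself is obviously

$$\nabla^2 = \nabla \cdot \nabla = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{16.38}$$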


You may well have met this differential operator before, perhaps in your study 
of electromagnetism. It plays a huge role in many subjects — electromagnetism, 
quantum mechanics, fluid mechanics, elasticity, thermodynamics, and more — and 


4 Strictly speaking the pressure p discussed throughout this section is the incremental pressure, 
the difference between the total pressure and the equilibrium atmospheric pressure. 
is called the Laplacian for its role in Laplace’s equation of electrostatics. With this
notation, we can rewrite the three-dimensional wave equation (16.36) as

$$\frac{\partial^2 p}{\partial t^2} = c^2 \nabla^2 p. \tag{16.39}$$


Plane Waves 

The equation (16.39) has many, many solutions, of which the simplest are the
so-called plane waves. A simple example of a plane wave is a solution of (16.39) [or
(16.36)] which is independent of y and z,

$$p(\mathbf{r}, t) = p(x, t).$$

Obviously a disturbance with this form has the same value at all points in any plane
x = constant. If we substitute this form into (16.36), the derivatives with respect to y
and z drop out, and we are left with the one-dimensional wave equation

$$\frac{\partial^2 p}{\partial t^2} = c^2 \frac{\partial^2 p}{\partial x^2}$$

whose most general solution we already know to be $p = f(x - ct) + g(x + ct)$. In
particular, a solution $p = f(x - ct)$ is a plane disturbance (p constant in any plane
perpendicular to the x axis) that is traveling in the x direction with speed c (hence the
name “plane wave”).

Similarly, a solution of the form $p = f(y - ct)$ or $f(z - ct)$ is a plane wave
traveling in the y or z direction. More generally, if $\mathbf{n}$ denotes an arbitrary unit vector,
then a disturbance of the form

$$p = f(\mathbf{n} \cdot \mathbf{r} - ct) \tag{16.40}$$

satisfies the wave equation (16.39), is constant in any plane perpendicular to $\mathbf{n}$, and
travels in the direction $\mathbf{n}$ with speed c. (See Problem 16.15.) If the function f is a
sinusoidal function, $f(\xi) = \cos k\xi$, say, the wave (16.40) is the sinusoidal plane wave

$$p = \cos[k(\mathbf{n} \cdot \mathbf{r} - ct)], \tag{16.41}$$

whose crests lie in planes perpendicular to n and travel with speed c in the direction of 
n, as shown in Figure 16.9. This kind of wave is easier to visualize in two dimensions. 
You could think, for example, of “plane” waves on the surface of a pond. 

A plane wave is a mathematical idealization that never occurs in practice, since no 
real disturbance can be constant over an infinite plane. Nevertheless, it is frequently a 
very useful approximation. The light shining on us from the sun is well approximated 
as a plane wave, as is the sound from a distant explosion. 
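The claim that the sinusoidal plane wave (16.41) satisfies the wave equation (16.39) for any unit vector n can be spot-checked numerically. The following Python sketch assumes NumPy and uses central finite differences; the wave number, speed, sample point, and step size are arbitrary illustrative values, and the function name is hypothetical. It is a sanity check at one point, not a proof (for that, see Problem 16.15).

```python
import numpy as np


def plane_wave_check(n, k=2.0, c=3.0, r=(0.3, -0.2, 0.5), t=0.1, h=1e-3):
    """Return (d2p/dt2, c^2 * Laplacian of p) for p = cos[k(n.r - ct)].

    Both quantities are estimated by central differences with step h,
    so they agree only up to a truncation error of order h^2.
    """
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)          # make sure n is a unit vector
    r = np.asarray(r, dtype=float)

    def p(pos, time):
        return np.cos(k * (n @ pos - c * time))

    # Laplacian (16.38): sum of second differences along x, y, and z.
    laplacian = 0.0
    for axis in range(3):
        step = np.zeros(3)
        step[axis] = h
        laplacian += (p(r + step, t) - 2.0 * p(r, t) + p(r - step, t)) / h**2

    # Second time derivative at the same point.
    p_tt = (p(r, t + h) - 2.0 * p(r, t) + p(r, t - h)) / h**2
    return p_tt, c**2 * laplacian


if __name__ == "__main__":
    lhs, rhs = plane_wave_check([1.0, 2.0, 2.0])
    print("d2p/dt2        =", lhs)
    print("c^2 Laplacian  =", rhs)
```

The two printed numbers should agree to within the finite-difference error, whatever direction is supplied for n.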
Figure 16.9 The sinusoidal plane wave (16.41). The wave crests (or wavefronts)
are planes perpendicular to the unit vector n and they travel with speed c in the 
direction of n. 


Spherical Waves 

Another important solution of the three-dimensional wave equation is a spherical
wave, for example, a sound wave traveling radially outward from a small,
omnidirectional loudspeaker. If we assume that such a wave is spherically symmetric, then
it must have the form $p = p(r, t)$. (That is, in spherical polar coordinates, p is
independent of $\theta$ and $\phi$.) It is not hard to show that for a function of this form

$$\nabla^2 p = \frac{1}{r} \frac{\partial^2}{\partial r^2}(rp). \tag{16.42}$$

[The obvious way to prove this is to evaluate the left side using the definition (16.38)
of $\nabla^2$; the simplest is to look up the expression for $\nabla^2$ in polar coordinates inside the
back cover. See Problem 16.16.] Therefore, the wave equation becomes


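$$\frac{\partial^2 p}{\partial t^2} = c^2 \frac{1}{r} \frac{\partial^2}{\partial r^2}(rp)$$

or, multiplying both sides by r,

$$\frac{\partial^2}{\partial t^2}(rp) = c^2 \frac{\partial^2}{\partial r^2}(rp). \tag{16.43}$$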


We see that for a spherical wave the function rp(r, t) satisfies the one-dimensional 
wave equation with respect to r and t. Therefore the general solution has the form 


$$rp(r, t) = f(r - ct) + g(r + ct).$$

In particular, if the function g is zero, the disturbance has the form

$$p(r, t) = \frac{1}{r} f(r - ct). \tag{16.44}$$

The factor $f(r - ct)$ represents a disturbance traveling rigidly outward. Since this is
what one might have guessed for a radially spreading wave, the question is, “Why the
factor of 1/r?” To answer this we need a result that you may recall from an introductory
physics class. The intensity of any wave in three dimensions is defined as the power
delivered by the wave to unit area perpendicular to the direction of propagation, and
the intensity of a sound wave is proportional to $p^2$. (For guidance in proving this, see
Problem 16.37.) Thus the factor 1/r in (16.44) implies that the intensity is proportional
to $1/r^2$, which is exactly what is required for conservation of energy: At a distance r
from the source, the energy of the wave is spread out over an area $4\pi r^2$. Therefore,
as the wave moves radially outward, the intensity has to fall off like $1/r^2$ to keep the
total power constant.

If the function f in (16.44) is sinusoidal, $f(\xi) = \cos k\xi$, say, then (16.44)
represents a sinusoidal wave whose crests are traveling radially outward from the origin
with speed c, as illustrated in Figure 16.10.
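The radial form (16.42) of the Laplacian can be checked against the Cartesian definition (16.38) for the sinusoidal spherical wave. The following Python sketch assumes NumPy and central finite differences; the constants K and C, the sample point, and the function names are illustrative assumptions. For this particular wave, rp = cos[k(r − ct)], so both estimates should also agree with −k²p.

```python
import numpy as np

K = 2.0  # wave number (illustrative value)
C = 3.0  # wave speed (illustrative value)


def pressure(pos, t):
    """Sinusoidal spherical wave (16.44): p = cos[k(r - ct)] / r."""
    r = np.linalg.norm(pos)
    return np.cos(K * (r - C * t)) / r


def cartesian_laplacian(pos, t, h=1e-3):
    """Laplacian from the definition (16.38), by central differences in x, y, z."""
    pos = np.asarray(pos, dtype=float)
    total = 0.0
    for axis in range(3):
        step = np.zeros(3)
        step[axis] = h
        total += (pressure(pos + step, t) - 2.0 * pressure(pos, t)
                  + pressure(pos - step, t)) / h**2
    return total


def radial_laplacian(pos, t, h=1e-3):
    """Right side of (16.42): (1/r) d^2(r p)/dr^2, by a central difference in r."""
    pos = np.asarray(pos, dtype=float)
    r = np.linalg.norm(pos)
    unit = pos / r

    def rp(s):
        return s * pressure(s * unit, t)

    return (rp(r + h) - 2.0 * rp(r) + rp(r - h)) / h**2 / r


if __name__ == "__main__":
    point = np.array([1.0, 2.0, 2.0])   # a point with r = 3
    print("Cartesian:", cartesian_laplacian(point, 0.1))
    print("Radial:   ", radial_laplacian(point, 0.1))
    print("-k^2 p:   ", -K**2 * pressure(point, 0.1))
```

The amplitude of p falls off as 1/r, so the product of p² with the area 4πr² has an r-independent amplitude, which is the energy-conservation statement made above.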



Figure 16.10 A spherical wave. The wave crests are spheres moving outward 
from the origin. To help visualize what this shows, it may help to think of it as 
a two-dimensional wave, such as the ripples created on a pond by a stick which 
moves in and out of the water at the origin. 


16.5 Volume and Surface Forces 


Our next objective is to see how to find the equation of motion of a three-dimensional 
continuous system. In general, this is a very complicated problem, and the details 
depend strongly on the precise nature of the system. For example, the equations of 
motion of a fluid are very different from those of an elastic solid. Nevertheless, there 
are some reasonably simple general principles that apply to many different continuous 
systems, and these are what I shall describe next. 

As you would probably guess, we find the equation of motion of a continuous body 
by applying Newton’s second law to an arbitrary small element dV of the body. (I use 
the word “body” here to mean any chunk of matter, solid, liquid, or gas.) This is exactly 
analogous to what we did in Section 16.1, where we applied Newton’s second law to 
a length dx of the one-dimensional string, but the three-dimensional case is naturally 
more complicated. We need first to discuss the geometry of the volume element dV 
and then the specification of the forces on, and resulting displacement of, dV. 
Elements of Volume


The shape of the volume element on which we focus is arbitrary. It could be spherical, 
or rectangular (in the shape of a brick), or anything else our whim dictates. For 
simplicity, you could have in mind a simple rectangular volume, as in Figure 16.11. 
The volume on which we focus will usually be an infinitesimal volume, as the label dV
is intended to suggest. The surface that bounds dV could be the actual boundary of the
continuous body (for example, the walls of the cylinder containing a gas), but it will
usually be an “imaginary” surface, that is, an arbitrary surface interior to the whole
body. The whole bounding surface is naturally a closed surface that divides all of
space into exactly two parts, the “inside” (namely, dV) and the “outside” (everything
else). This means that we can specify the orientation of any part of the surface, such
as the face S in Figure 16.11, by the unit vector $\mathbf{n}$, normal to the surface and pointing
outward from dV.$^5$



Figure 16.11 A small element of volume dV of a continuous body can have
any shape, but a convenient choice is the rectangular shape shown here. The 
orientation of any part S of the surface (the right-hand end here) is specified 
by a unit vector n that points outward from dV. 


Forces on the Volume Element 

The two most important kinds of force on a volume element dV of a continuous
system are called volume forces and surface forces. An example of a volume force
is the force of gravity, $\mathbf{F} = \varrho\,\mathbf{g}\,dV$, where $\varrho$ is the mass density of the material and $\mathbf{g}$
is the acceleration of gravity. A second example is the electrostatic force $\mathbf{F} = \varrho\,\mathbf{E}\,dV$
of an electric field $\mathbf{E}$ on a material with charge density $\varrho$. The definition of a volume
force is simply that it is a force proportional to the volume dV. Volume forces are 
generally the result of an external field (such as gravity), and to be definite I shall 
usually assume that the only volume force is that of gravity. In any event, the body 
forces are almost always known and well understood. Therefore, our main concern is 
with the surface forces. 

A familiar example of a surface force is the force p dA of the pressure p in a fluid 
on a small surface element of area dA. The definition of a surface force is a force 
proportional to the area dA of the surface on which it acts. Surface forces are generally 


5 When one is concerned with a nonclosed surface, the orientation of a small part S can still be
specified by a unit vector $\mathbf{n}$ that is normal to S, but this leaves an ambiguous sign (since $\mathbf{n}$ and $-\mathbf{n}$
both fit the definition). Fortunately, in this chapter S will always be a part of a closed surface, so we
can insist, simply and unambiguously, that $\mathbf{n}$ point outward.
[Figure 16.12 panels: (a) Pressure, (b) Tension, (c) Shear]

Figure 16.12 Three different surface forces on the face S of a rectangular
volume. (a) Hydrostatic pressure acts normally to the surface and inward, so
that $\mathbf{F} = -p\,\mathbf{n}\,dA$, with the minus sign since $\mathbf{F}$ is inward whereas $\mathbf{n}$ is outward.
(b) A simple tension in the direction normal to S. (c) By definition, a shearing
force acts tangentially to S.





Figure 16.13 A shearing force F applied to the face S of the rectangular solid 
of Figure 16.12 (seen here from directly in front). If the opposite face is held 
fixed, the shear produces the motion shown, in which the planes parallel to S all 
move in the direction of F, changing the originally rectangular cross section into 
a parallelogram. The distances dx and dy are used to define the shearing strain 
in Equation (16.54) below. 


the result of intermolecular forces of the molecules just outside the surface acting on 
those just inside. Figure 16.12 shows three important special cases of surface forces, 
a pressure force, a simple tension, and a shearing force. Notice that both the pressure 
force and the tension act normally to the surface S, whereas the shearing force, by 
definition, acts tangentially to S. The tendency of a shearing force is to produce the 
shearing motion shown in Figure 16.13. 


When Is Pressure Isotropic? 

To conclude this section, I shall prove a result that you probably learned in an 
introductory physics class, that the pressure in any static fluid acts equally in all 
directions, or, briefly, that the pressure is isotropic. Actually, the result is a bit more 
general than this, and I shall prove it in its greater generality. A characteristic property 
of any fluid is that it can support no shearing forces in equilibrium, and the absence 
of shearing forces is in fact the essential feature that leads to the isotropy of pressure. 
I shall prove in a moment that in any substance where there are no shearing forces the 
pressure is isotropic. Clearly the result applies to any static fluid, but it also applies 
[Figure 16.14 labels: $\mathbf{F}_1 = -p_1 \mathbf{n}_1\,dA_1$, $\mathbf{F}_2 = -p_2 \mathbf{n}_2\,dA_2$]

Figure 16.14 The surfaces $S_1$ and $S_2$ (seen edge on) are normal to the
two arbitrary unit vectors $\mathbf{n}_1$ and $\mathbf{n}_2$. They are identical rectangles and,
together with $S_3$, form an isosceles prism (seen end on). The three forces
$\mathbf{F}_1$, $\mathbf{F}_2$, and $\mathbf{F}_3$ are the surface forces on the three faces and are normal to
the surfaces since, by assumption, there are no shearing forces.


to a moving fluid, provided there are no shearing forces. Since the cause of shearing 
forces in fluids is viscosity, we can say that pressure is isotropic even in moving fluids, 
provided the viscosity is zero. Of course, there are very few fluids whose viscosity is 
exactly zero, but there are plenty of situations where the viscosity is small enough to 
be negligible, the most obvious example being any flow in which all speeds are very 
slow. Our result is obviously very useful, but, more important, the method of proof 
has many other applications, as we shall see. 

Let us consider, then, any medium in which there are no shearing forces, and let
$\mathbf{n}_1$ and $\mathbf{n}_2$ be any two directions. At any particular point in the medium, let us construct
two small, equal rectangular surfaces, $S_1$ normal to $\mathbf{n}_1$ and $S_2$ normal to $\mathbf{n}_2$, so as to
form a small triangular prism, as in Figure 16.14. The surface forces on the three faces
shown are normal to the faces and can be written as $\mathbf{F}_1 = -p_1 \mathbf{n}_1\,dA_1$ and so on, where
we write the pressures on the three faces as $p_1$, $p_2$, and $p_3$ to allow for the possibility
that they might be different. (Our aim is to prove that, in fact, $p_1 = p_2 = p_3$.) These
pressures can, of course, vary from point to point in the fluid, but by considering a
small enough volume we can ensure that they vary by a negligible amount within our
volume. We are now ready to apply Newton’s second law to our small prism. The mass
of the prism is $m = \varrho\,dV$. The net force on the prism is $\mathbf{F}_1 + \mathbf{F}_2 + \mathbf{F}_3 + \mathbf{F}_{\text{vol}}$, where
$\mathbf{F}_{\text{vol}}$ denotes the total volume force (for example, the weight $\mathbf{F}_{\text{vol}} = \varrho\,\mathbf{g}\,dV$). Thus the
equation $\mathbf{F} = m\mathbf{a}$ becomes$^6$

$$\mathbf{F}_1 + \mathbf{F}_2 + \mathbf{F}_3 + \mathbf{F}_{\text{vol}} = m\mathbf{a}$$

which we can rewrite as

$$\mathbf{F}_1 + \mathbf{F}_2 + \mathbf{F}_3 = m\mathbf{a} - \mathbf{F}_{\text{vol}}. \tag{16.45}$$


6 Strictly speaking we should include the two pressure forces on the two ends of our prism, but 
we shall be concerned only with the components of this equation in the plane of the picture in Figure 
16.14, so we can ignore these. 
Now comes the supreme act of cunning. Equation (16.45) applies to the small prism
of Figure 16.14, but it would certainly also apply to a smaller prism. Let us therefore
shrink our prism by a factor of $\lambda$ in all three directions. The three surface terms on the
left of (16.45) are proportional to area, so they will decrease by a factor of $\lambda^2$. Both
the mass and volume force on the right are proportional to dV and must decrease by
a factor of $\lambda^3$. Thus the counterpart of (16.45) for our smaller prism is

$$\lambda^2 (\mathbf{F}_1 + \mathbf{F}_2 + \mathbf{F}_3) = \lambda^3 (m\mathbf{a} - \mathbf{F}_{\text{vol}})$$

or, dividing both sides by $\lambda^2$,

$$(\mathbf{F}_1 + \mathbf{F}_2 + \mathbf{F}_3) = \lambda (m\mathbf{a} - \mathbf{F}_{\text{vol}}). \tag{16.46}$$

This equation holds for any value of $\lambda$ (smaller than 1). In particular, we can let $\lambda$
approach zero, and we reach the surprising conclusion that the three surface terms
must sum to zero by themselves,

$$\mathbf{F}_1 + \mathbf{F}_2 + \mathbf{F}_3 = 0. \tag{16.47}$$

It is easy to check that, because the triangle in Figure 16.14 is isosceles, this
requires that $\mathbf{F}_1$ and $\mathbf{F}_2$ must have equal magnitudes, $F_1 = F_2$. (Just take components
perpendicular to $\mathbf{F}_3$ to check this.) Since $F_1 = p_1\,dA_1$ and $F_2 = p_2\,dA_2$, and $dA_1 = dA_2$,
we conclude that

$$p_1 = p_2. \tag{16.48}$$

Since the directions $\mathbf{n}_1$ and $\mathbf{n}_2$ were arbitrary, we have proved that pressure is
independent of direction in any medium where there are no shearing forces. In particular,
pressure is isotropic in any static fluid and also in any moving fluid that has negligible
viscosity.
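The force balance (16.47) can also be examined as a small linear-algebra problem. The following Python sketch assumes NumPy and works in the plane of Figure 16.14. It assumes that the third face's vector area is fixed by closure of the triangular cross section, which is the same closure property used for the prism of Section 16.7. The function name and the sample angles are illustrative. Writing the net surface force as −M p, where the columns of M are the three vector areas, the pressures that satisfy (16.47) are exactly the null vectors of M.

```python
import numpy as np


def pressure_ratios(theta1, theta2, area=1.0):
    """Return (p1, p2, p3) / p1 for the pressures that make F1 + F2 + F3 = 0.

    theta1, theta2: directions (radians) of the outward normals n1 and n2.
    The two faces S1 and S2 have equal area, as in Figure 16.14; the third
    face is ASSUMED to close the triangle, so its vector area is -(a1 + a2).
    """
    a1 = area * np.array([np.cos(theta1), np.sin(theta1)])
    a2 = area * np.array([np.cos(theta2), np.sin(theta2)])
    a3 = -(a1 + a2)

    # Net surface force is -(p1*a1 + p2*a2 + p3*a3) = -M @ p.
    M = np.column_stack([a1, a2, a3])

    # For independent n1 and n2, M has rank 2, so its null space is
    # one-dimensional; the last right-singular vector spans it.
    _, _, vt = np.linalg.svd(M)
    null_vector = vt[-1]
    return null_vector / null_vector[0]


if __name__ == "__main__":
    print(pressure_ratios(0.3, 2.0))
```

For any two independent directions the ratios come out equal to one, so in this sketch all three pressures agree, consistent with (16.48) and with the arbitrariness of $\mathbf{n}_1$ and $\mathbf{n}_2$.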


16.6 Stress and Strain: The Elastic Moduli 


As we shall see in the next section, the surface forces inside a continuous body (solid, 
liquid, or gas) can be expressed in terms of a three-dimensional tensor called the stress 
tensor. In Section 16.8, we’ll see that the resulting displacements of the body can be 
expressed in terms of a second tensor called the strain tensor. Finally, before we can 
write down the equation of motion we need to establish the relationship between the 
two tensors. [This last statement is the continuum analog of the familiar requirement 
that to write down the equation of motion of a mass on a spring, we need to know 
Hooke’s law (F = kx), which relates the tension in the spring (F) to the extension 
(x).] The general theory of the stress and strain tensors is quite complicated, so in this 
section I shall mention a few simple special cases before we plunge into the general 
case. 
Stress


Since any surface force F is proportional to the area A of the surface on which it acts, 
it is natural to consider the ratio F/A, and this ratio is called the stress. As we shall 
see in the next section, in general we need to discuss a stress tensor, but a simple 
example that lacks this complication is the pressure force in a static fluid, for which 
the stress is just the pressure: 

$$\text{stress} = \frac{F}{A} = \text{pressure, } p \quad \text{[in a static fluid]}. \tag{16.49}$$

Similarly, you may recall that the stress in a wire or rod subjected to a simple tension 
is defined as 

$$\text{stress} = \frac{\text{tension}}{\text{area}} \quad \text{[for a wire in tension]} \tag{16.50}$$

where the area is the cross-sectional area of the wire. For a simple shearing force, like 
that in Figure 16.13, the shearing stress is defined as 


$$\text{stress} = \frac{\text{shearing force}}{\text{area}} \quad \text{[for a shear]}. \tag{16.51}$$


However complicated the situation, we shall find that the stress (or any component 
of the stress tensor) can be defined as the ratio of a surface force (or one of its 
components) to the area of the surface on which it acts. In particular, stress always 
has the dimensions of [force/area]. 


Strain 

The result of a stress is almost always a deformation, or change in the dimensions, 
of the body on which the stress acted — a change in the volume of a liquid, or the 
length of a wire, for instance. When this change is expressed as a fractional change 
it is called the strain. For example, in a static fluid, the strain would be the fractional 
change in volume, 

$$\text{strain} = \frac{dV}{V} \quad \text{[in a static fluid]}. \tag{16.52}$$

For a wire under tension, the strain would be the fractional change in length,

$$\text{strain} = \frac{dl}{l} \quad \text{[for a wire in tension]}. \tag{16.53}$$

For the simple shearing force of Figure 16.13, the strain is defined as

$$\text{strain} = \frac{dy}{dx} \quad \text{[for a shear]} \tag{16.54}$$

where dy is the displacement in the direction of shear and dx is the perpendicular 
distance across which the shear occurs (see Figure 16.13). 
Relation of Stress to Strain: The Elastic Moduli

When the stresses in a medium are not too large, we would expect that the resulting 
strain would be linear in the stress. In the case of a stretched wire, this relation is 
written as 


$$\text{stress} = (\text{Young’s modulus}) \times \text{strain} \quad \text{or} \quad \frac{dF}{A} = \mathrm{YM} \cdot \frac{dl}{l} \tag{16.55}$$

where YM is Young’s modulus for the material of the wire.$^7$ [If you rewrite this
equation as $dF = (A\,\mathrm{YM}/l)\,dl$, you will recognize it as Hooke’s law, with the force
constant $k = A\,\mathrm{YM}/l$. The advantage of writing it in terms of Young’s modulus is
that YM, unlike k, is characteristic of the material and independent of the dimensions
A and l.] In Equation (16.55), dl is the extension caused by an increment dF in the
tension.

For any material subject to hydrostatic pressure only, a small increase dp in 
pressure will cause a change in volume given by 

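$$\text{stress} = (\text{bulk modulus}) \times \text{strain} \quad \text{or} \quad dp = -\mathrm{BM} \cdot \frac{dV}{V} \tag{16.56}$$

where BM is the bulk modulus for the material, and the minus sign is because an
increase in pressure causes a decrease in volume. For a shearing stress,

$$\text{stress} = (\text{shear modulus}) \times \text{strain} \quad \text{or} \quad \frac{F}{A} = \mathrm{SM} \cdot \frac{dy}{dx} \tag{16.57}$$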

where SM is the shear modulus for the material. 
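To make the three relations (16.55) to (16.57) concrete, here is a minimal Python sketch that solves each of them for the deformation. The function names are hypothetical. The numerical values in the usage lines (a modulus of 2 × 10¹¹ Pa, a 1 mm² wire of length 2 m, a 100 N load, and a bulk modulus of 2.2 × 10⁹ Pa) are illustrative assumptions of roughly realistic size, not data from the text.

```python
def hooke_constant(area, length, youngs_modulus):
    """Force constant k = A * YM / l of a wire, from (16.55)."""
    return area * youngs_modulus / length


def wire_extension(force, area, length, youngs_modulus):
    """Extension dl of a wire under an added tension dF, from (16.55)."""
    return force * length / (area * youngs_modulus)


def volume_change(pressure_increase, volume, bulk_modulus):
    """Change dV in volume caused by a pressure increase dp, from (16.56).

    The result is negative for a positive dp: more pressure, less volume.
    """
    return -volume * pressure_increase / bulk_modulus


def shear_displacement(force, area, thickness, shear_modulus):
    """Displacement dy across a thickness dx under a shearing force, from (16.57)."""
    return force * thickness / (area * shear_modulus)


if __name__ == "__main__":
    YM = 2.0e11  # Pa; ASSUMED illustrative value
    print("k  =", hooke_constant(1.0e-6, 2.0, YM), "N/m")
    print("dl =", wire_extension(100.0, 1.0e-6, 2.0, YM), "m")
    print("dV =", volume_change(1.0e5, 1.0, 2.2e9), "m^3")
```

With these assumed numbers the wire behaves as a spring of constant k = 10⁵ N/m and stretches by 1 mm under the 100 N load, which illustrates why k depends on the dimensions A and l while YM does not.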

To summarize this section, stress characterizes the surface forces in a continuous 
medium, 


$$\text{stress} = \frac{\text{force}}{\text{area}}, \tag{16.58}$$


while strain characterizes the resulting deformation, 


strain = fractional deformation. (16.59) 


7 There are almost as many notations for the various elastic moduli as there are books on the 
subject. The notations used here are unconventional, but you will, I hope, be able to remember which 
is which. 
While stress always has the units of pressure (force/area), strain is always
dimensionless.$^8$ The various elastic moduli (Young’s, bulk, and shear) are the ratios of stress
to the corresponding strain,

$$\text{elastic modulus} = \frac{\text{stress}}{\text{corresponding strain}}. \tag{16.60}$$


16.7 The Stress Tensor 


In this section I shall derive the general expression for the surface force on a small 
area dA of a closed surface S in a continuous medium. As usual, we shall use $\mathbf{n}$ to
denote the unit outward normal to S at the location of dA. To streamline the notation,
I shall define a vector $d\mathbf{A}$ in the direction of $\mathbf{n}$, with magnitude dA. That is,

$$d\mathbf{A} = \mathbf{n}\,dA. \tag{16.61}$$

This vector tells us the orientation and size of the small piece of surface under
consideration. Our first task is to show that the surface force $\mathbf{F}(d\mathbf{A})$ on the surface
element specified by $d\mathbf{A}$ is in fact a linear function of $d\mathbf{A}$, that is, that

$$\mathbf{F}(\lambda_1\,d\mathbf{A}_1 + \lambda_2\,d\mathbf{A}_2) = \lambda_1 \mathbf{F}(d\mathbf{A}_1) + \lambda_2 \mathbf{F}(d\mathbf{A}_2) \tag{16.62}$$

where $\lambda_1$ and $\lambda_2$ are any two real numbers and $d\mathbf{A}_1$ and $d\mathbf{A}_2$ are any two vectors.

The force $\mathbf{F}(d\mathbf{A})$ is independent of the precise shape of the surface element. On the
other hand, it is proportional to the area dA, so

$$\mathbf{F}(\lambda\,d\mathbf{A}) = \lambda \mathbf{F}(d\mathbf{A}) \tag{16.63}$$

for any positive number $\lambda$ (not too large). If we were to replace $d\mathbf{A}$ by $-d\mathbf{A}$, this would
interchange the inside and outside of our surface, and by Newton’s third law this would
change the sign of the surface force; that is, $\mathbf{F}(-d\mathbf{A}) = -\mathbf{F}(d\mathbf{A})$. Therefore, Equation
(16.63) actually holds for negative, as well as positive, values of $\lambda$.

Let us next consider any two small vectors $d\mathbf{A}_1$ and $d\mathbf{A}_2$. At any point in our
continuous medium, consider two small rectangular surfaces touching along one
common edge, with orientations and areas given by $d\mathbf{A}_1$ and $d\mathbf{A}_2$, as shown in Figure
16.15. Consider now the triangular prism defined by these two rectangles and the third
rectangle labeled as $d\mathbf{A}_3$ in the figure. This prism has two remarkable properties. First,


8 There is no very obvious way to remember which is stress and which strain. One possibility is 
this: In everyday language we say “stress causes strain” and likewise “force causes deformation.” 
Thus “stress” goes with “force” and “strain” with “deformation.” Alternatively, note that
alphabetically, “stress” comes after “strain” just as “force” comes after “deformation.”
Figure 16.15 The arbitrary small vectors $d\mathbf{A}_1$ and $d\mathbf{A}_2$ define two rectangular
surfaces (seen edge on) which meet at one common edge (bottom right).
These two rectangles define a triangular prism (seen end on) whose third
rectangular face is labeled by the vector $d\mathbf{A}_3$. The surface forces on the three
rectangular faces are shown as $\mathbf{F}(d\mathbf{A}_1)$ and so on.

because the three edges shown form a closed triangle, the same is true of the three
vectors $d\mathbf{A}_1$, $d\mathbf{A}_2$, and $d\mathbf{A}_3$. Therefore

$$d\mathbf{A}_1 + d\mathbf{A}_2 + d\mathbf{A}_3 = 0.$$

Second, by the same argument as we used to prove the isotropy of pressure in an
inviscid fluid [Equation (16.47)], we can prove that$^9$

$$\mathbf{F}(d\mathbf{A}_1) + \mathbf{F}(d\mathbf{A}_2) + \mathbf{F}(d\mathbf{A}_3) = 0.$$

Exploiting these last two equations in turn, we find that

$$\mathbf{F}(d\mathbf{A}_1 + d\mathbf{A}_2) = \mathbf{F}(-d\mathbf{A}_3) = -\mathbf{F}(d\mathbf{A}_3) = \mathbf{F}(d\mathbf{A}_1) + \mathbf{F}(d\mathbf{A}_2). \tag{16.64}$$

Finally, combining (16.63) and (16.64), we can immediately verify (16.62), and we
have proved that the force $\mathbf{F}(d\mathbf{A})$ is linear in $d\mathbf{A}$.

It is a fundamental result of linear algebra that if one vector ($\mathbf{F}$ in this case) is a
linear function of a second vector ($d\mathbf{A}$), then the components of the first are related
to those of the second by a linear relation of the form$^{10}$

$$F_i(d\mathbf{A}) = \sum_{j=1}^{3} \sigma_{ij}\,dA_j \tag{16.65}$$


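9 I am again ignoring the forces on the two triangular ends. If we denote the two ends by $d\mathbf{A}_4$
and $d\mathbf{A}_5$, then $d\mathbf{A}_4 = -d\mathbf{A}_5$, so, by (16.63), $\mathbf{F}(d\mathbf{A}_4) = -\mathbf{F}(d\mathbf{A}_5)$ and these two forces cancel each
other.

10 The proof is quite easy: Suppose that $\mathbf{u}$ is a linear function of $\mathbf{v}$. Since $u_i = \mathbf{e}_i \cdot \mathbf{u}$ and
$\mathbf{v} = \sum_j \mathbf{e}_j v_j$, it follows that $u_i(\mathbf{v}) = \mathbf{e}_i \cdot \mathbf{u}(\sum_j \mathbf{e}_j v_j) = \sum_j [\mathbf{e}_i \cdot \mathbf{u}(\mathbf{e}_j)] v_j$, which has the advertised
form (16.65) with $\sigma_{ij} = \mathbf{e}_i \cdot \mathbf{u}(\mathbf{e}_j)$.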
or, in matrix form,

$$\mathbf{F}(d\mathbf{A}) = \boldsymbol{\Sigma}\,d\mathbf{A}. \tag{16.66}$$

In this second relation (16.66), $\boldsymbol{\Sigma}$ denotes a $(3 \times 3)$ matrix,$^{11}$ made up of the nine
numbers $\sigma_{ij}$ of (16.65). The matrix $\boldsymbol{\Sigma}$ defines a second-rank, three-dimensional tensor
called the stress tensor. The stress tensor can, of course, vary from point to point in
the medium, but its significance at each point $\mathbf{r}$ is this: For every point $\mathbf{r}$ in the medium
(and at each given time t), there is a unique $(3 \times 3)$ matrix $\boldsymbol{\Sigma}$ which gives the force
on any surface element $d\mathbf{A}$ at $\mathbf{r}$ via the relation (16.66).
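The relation (16.66) is just a matrix acting on a vector, which is easy to try out numerically. The following Python sketch assumes NumPy; the function names and the sample matrix are illustrative. The hydrostatic case uses a stress matrix equal to −p times the unit matrix, which is inferred here from the pressure force $\mathbf{F} = -p\,\mathbf{n}\,dA$ of Figure 16.12 and is not written out in the text at this point.

```python
import numpy as np


def surface_force(sigma, n, dA):
    """Force (16.66) on a surface element of area dA with outward normal n."""
    n = np.asarray(n, dtype=float)
    dA_vec = dA * n / np.linalg.norm(n)      # the vector dA = n dA of (16.61)
    return np.asarray(sigma, dtype=float) @ dA_vec


def hydrostatic_stress(p):
    """ASSUMED form of the stress matrix for a pure pressure p: -p times the unit matrix."""
    return -p * np.eye(3)


if __name__ == "__main__":
    n = np.array([1.0, 2.0, 2.0]) / 3.0
    # Pure pressure: the force is -p n dA, whatever the orientation of n.
    print(surface_force(hydrostatic_stress(5.0), n, 0.01))

    # A general (illustrative) stress matrix acting on a face normal to the x axis:
    sigma = np.array([[1.0, 0.5, 0.0],
                      [0.5, 2.0, 0.3],
                      [0.0, 0.3, 3.0]])
    # The force per area is the first column (sigma_11, sigma_21, sigma_31).
    print(surface_force(sigma, [1.0, 0.0, 0.0], 2.0) / 2.0)
```

The second printout returns the first column of the matrix: one normal component and two shearing components of the force per area on a surface perpendicular to the x axis.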


The Elements of the Stress Tensor 

The mathematical significance of the stress tensor $\boldsymbol{\Sigma}$ and its elements cannot be
more succinctly expressed than in the relations (16.66) and (16.65), but to get a feel
for their physical significance it helps to look at some special cases. For example, let
us consider a small area dA normal to the x axis, for which $d\mathbf{A} = \mathbf{e}_1\,dA$. Since only
one of the components of $d\mathbf{A}$ is nonzero (namely, the first component), the sum in
(16.65) reduces to a single term. For example, the x component of (16.65) reads

$$F_1(\text{on area } dA \text{ normal to } \mathbf{e}_1) = \sigma_{11}\,dA.$$

Turning this around, we can say that $\sigma_{11}$ is the first component of the force per area on
a surface perpendicular to the first (x) axis. In the same way, we conclude that $\sigma_{ii}$ is
the ith component of the force per area on a surface perpendicular to the ith axis. To
put this another way, a diagonal element $\sigma_{ii}$ of the stress tensor, $\boldsymbol{\Sigma}$, gives the normal
component of the force per area on a surface perpendicular to the ith axis.

The off-diagonal elements $\sigma_{ij}$ ($i \neq j$) can be similarly interpreted. Consider again
the case of a small area dA normal to the x axis, for which the second component of
(16.65) reads

$$F_2(\text{on area } dA \text{ normal to } \mathbf{e}_1) = \sigma_{21}\,dA.$$

Evidently $\sigma_{21}$ is the second (y) component of the force per area on a surface
perpendicular to the first (x) axis. A similar argument applies to $\sigma_{31}$, and we can say that
$\sigma_{21}$ and $\sigma_{31}$ are the two components of the tangential or shearing force per area on a
surface perpendicular to the first axis. More generally the six off-diagonal elements
$\sigma_{ij}$ ($i \neq j$) tell us the six shearing forces on the three coordinate planes through the
point under consideration.


11 Don’t confuse the boldface capital Greek “sigma,” $\boldsymbol{\Sigma}$, in (16.66) with the summation sign in
(16.65).





The Stress Tensor Is Symmetric 

The six off-diagonal elements $\sigma_{ij}$ are not actually independent, since they are equal in
pairs. Specifically, the stress tensor $\boldsymbol{\Sigma}$ is symmetric, so that $\sigma_{ij} = \sigma_{ji}$, as we can now
prove using, yet again, the argument introduced at the end of Section 16.5 to prove
the isotropy of pressure in an inviscid fluid. This time, we’ll consider a small prism,
whose axis is in the z direction and whose cross section is a square, parallel to the
xy plane, as shown in Figure 16.16. The prism’s angular momentum about its axis
satisfies

$$\frac{dL_3}{dt} = \Gamma_3, \tag{16.67}$$

where $\Gamma_3$ is the z component of the net torque on the prism. The four forces (actually
components of forces) that contribute to $\Gamma_3$ are the shearing forces shown in the figure
as $F_a$, $F_b$, $F_c$, and $F_d$. From (16.65), we see that $F_a = \sigma_{12}\,dA$, while $F_b = \sigma_{21}\,dA$. The
forces $F_c$ and $F_d$ are equal in magnitude to $F_a$ and $F_b$ respectively, but in the opposite
directions. Thus the total torque $\Gamma_3$ is

$$\Gamma_3 = F_b l - F_a l = (\sigma_{21} - \sigma_{12})\,l\,dA. \tag{16.68}$$

Using the now-familiar trick, we next reduce our prism by a factor of $\lambda$ in all three
directions. In (16.67) this reduction multiplies $\Gamma_3$ by a factor of $\lambda^3$, but $L_3$ by a factor
of $\lambda^4$. If we divide by $\lambda^3$ and let $\lambda \to 0$, we see that $\Gamma_3$ must in fact be zero. According
to (16.68), this implies that $\sigma_{21} = \sigma_{12}$. Similar arguments take care of the other
off-diagonal elements, and we conclude that

$$\sigma_{ij} = \sigma_{ji} \tag{16.69}$$

for all i and j. That is, the stress tensor $\boldsymbol{\Sigma}$ is symmetric.



Figure 16.16 End view of a square prism with its axis parallel to the z axis.
The four forces that contribute to the prism’s rotation about its axis are shown
as $F_a$, $F_b$, $F_c$, and $F_d$. The square ends have side l and the four faces that are
seen end-on here have area dA.

Example 16.3 The Stress Tensor in a Static Fluid

Write down the stress tensor $\boldsymbol{\Sigma}$ in a static fluid at a point where the pressure is p.

We know that in a static fluid there are no shearing forces and that, at any
given point, the pressure is the same in all directions. Thus the surface force on
a small surface element labeled by dA is just

$$\mathbf{F}(d\mathbf{A}) = -p\,d\mathbf{A}$$

where p is a constant (at any one point of the fluid), independent of dA. (The
minus sign is because dA points outward, whereas the pressure force is inward.)
Comparing this with the definition (16.66) of the stress tensor, we see that

$$\boldsymbol{\Sigma} = -p\,\mathbf{1} \tag{16.70}$$

where $\mathbf{1}$ is the (3 × 3) unit matrix. This beautifully simple result expresses
succinctly that in a static fluid (and also a moving fluid provided the viscosity is
negligible) the only surface force is the pressure force, which is normal to the
surface and independent of the surface’s orientation. 12

Example 16.4 A Numerical Example of Stress

At a certain point P in a continuous medium the stress tensor has the value

$$\boldsymbol{\Sigma} = \begin{bmatrix} -1 & 2 & 0 \\ 2 & -2 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{16.71}$$

A small surface element at P has area dA and is parallel to the plane x + y + z =
0. What is the force on this surface element and what is the angle between this
force and the normal to the surface element? To be definite, take P to be in the
positive octant (x, y, z all positive), and assume that the outside of the surface
is the side away from the origin.

Notice first that I have not specified the units of the components of $\boldsymbol{\Sigma}$, but
they would of course be the units of pressure (force/area). If our medium was
the water at the bottom of a river they might be kilopascals; for a rock in the
earth’s crust, they might be megapascals.
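The calculation this example asks for can be sketched numerically. The sketch below is only an illustration under stated assumptions: it takes the stress tensor to have rows (−1, 2, 0), (2, −2, 0), (0, 0, 1), uses (16.66) in the form F = Σ n dA with the outward normal along (1, 1, 1), and works per unit area (dA = 1). The function names are hypothetical.

```python
import math

def mat_vec(m, v):
    """Multiply a 3x3 matrix (given as a list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

# Assumed stress tensor, in whatever pressure units the problem uses.
SIGMA = [[-1.0, 2.0, 0.0],
         [2.0, -2.0, 0.0],
         [0.0, 0.0, 1.0]]

def force_per_area(sigma, normal):
    """Return (force per unit area, unit normal) for a surface with the given outward normal."""
    length = norm(normal)
    n = [c / length for c in normal]
    return mat_vec(sigma, n), n

def angle_to_normal_deg(sigma, normal):
    """Angle in degrees between the surface force and the surface normal."""
    f, n = force_per_area(sigma, normal)
    cos_theta = dot(f, n) / (norm(f) * norm(n))
    return math.degrees(math.acos(cos_theta))

if __name__ == "__main__":
    f, n = force_per_area(SIGMA, [1.0, 1.0, 1.0])
    print("force per unit area:", f)
    print("angle to normal (degrees):", angle_to_normal_deg(SIGMA, [1.0, 1.0, 1.0]))
```

Because the force has a component along the surface, the angle comes out nonzero; with a stress tensor of the form −p1 the same functions would return a force antiparallel to the normal.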

The force on our surface element is given by (16.66) with $d\mathbf{A} = \mathbf{n}\,dA$, where
n is the unit normal to the surface. We are told that the surface is parallel to the
plane x + y + z = 0, so 13

$$\mathbf{n} = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}. \tag{16.72}$$

Therefore the force on the surface element is

$$\mathbf{F}(d\mathbf{A}) = dA\,\boldsymbol{\Sigma}\,\mathbf{n} = \frac{dA}{\sqrt{3}} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}. \tag{16.73}$$

The angle $\theta$ between the force and the normal is given by

$$\cos\theta = \frac{\mathbf{n}\cdot\mathbf{F}}{|\mathbf{F}|\,|\mathbf{n}|} = \frac{(2/3)\,dA}{\sqrt{2/3}\,dA \times 1} = \sqrt{2/3}.$$

So $\theta = \arccos(\sqrt{2/3}) = 35.3°$. This last illustrates the obvious fact that, in the
presence of shearing forces, the force on a surface element is not necessarily
normal to the surface.

12 This example suggests a neat alternative proof of the isotropy of the pressure force. The
absence of any shearing forces means that $\boldsymbol{\Sigma}$ must be diagonal (all off-diagonal elements equal
to zero) with respect to any choice of axes. It is easy to show that the only tensor with this property
is a multiple of the unit matrix.

13 There are several ways to see this. One simple one is to note that the plane is given in the form
f(x, y, z) = constant, and it is a standard result of vector calculus that the vector $\nabla f$ is normal to a
surface of this form (see Problem 4.18). Therefore, $\mathbf{n} = \pm\nabla f/|\nabla f|$. Since n must point away from
the origin, the plus sign applies, and it is easy to check that this gives the result (16.72).


16.8 The Strain Tensor for a Solid


In the last section we saw how the stress tensor expresses the surface forces inside
a continuous medium, solid, liquid, or gas. This section presents a corresponding
discussion of the strain tensor as a description of the displacements within the medium.
Unfortunately, in this case, the analysis of solids is quite different from that of fluids,
and for simplicity I shall confine myself to the discussion of solids. 14

To specify the configuration of a continuous solid we must give the position of
each of its continuously many constituent pieces. A convenient way to do this is to
specify that the particular small volume dV that was “originally” at position r is
now at position r + u(r). The “original” position could be its equilibrium position
or just the position at some convenient initial time $t_0$. Either way, the vector u(r) is
the displacement needed to move the piece from its reference position r to its current
position.

At first glance, you might think that u(r) would be a good measure of the strain of
the body, but it isn’t hard to see that this is not so. Consider, for example, the possibility
that $\mathbf{u}(\mathbf{r}) = \mathbf{u}_0$ is just a constant (independent of r). Such a displacement simply moves
our whole body rigidly through the vector $\mathbf{u}_0$ and requires no internal stresses at all.
Stresses arise not so much from displacement of our solid as from distortion, and
distortions require that different parts of the body are displaced by different amounts,
as illustrated in Figure 16.17(b).

Figure 16.17 (a) In a rigid translation, all points in the body are displaced by the
same amount. That is, u(r) is the same for all r, and the separation dr of any two
neighboring points is unchanged. (b) Any distortion of the body requires that u(r)
vary from point to point. Here the points r and r + dr move by different amounts,
and their separation changes from dr to dr + du.

14 It is easy to see that there has to be a big difference between solids and fluids. For example, a
change in the shape of a solid certainly constitutes a strain and usually entails appreciable stresses;
but a change in shape of a fluid (transferring milk from a square carton to a round bowl, for example)
does not normally constitute a strain since it requires no stresses.

Figure 16.17(a) shows a rigid translation, with u(r)
the same for all r, so that the separation, dr, of any two neighboring points remains
the same. In part (b), u(r) and u(r + dr) are different, and the separation of the two
neighboring points changes from dr to dr + du. The change du in their separation
can be expressed in terms of the derivatives of u with respect to r:


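$$du_i = \sum_j \frac{\partial u_i}{\partial r_j}\,dr_j \tag{16.74}$$

or, in matrix notation,

$$d\mathbf{u} = \mathbf{D}\,d\mathbf{r} \tag{16.75}$$

where D is the derivatives matrix (or derivatives tensor) made up of the partial
derivatives $\partial u_i/\partial r_j$,

$$\mathbf{D} = \begin{bmatrix}
\partial u_1/\partial r_1 & \partial u_1/\partial r_2 & \partial u_1/\partial r_3 \\
\partial u_2/\partial r_1 & \partial u_2/\partial r_2 & \partial u_2/\partial r_3 \\
\partial u_3/\partial r_1 & \partial u_3/\partial r_2 & \partial u_3/\partial r_3
\end{bmatrix}. \tag{16.76}$$
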

The elements of the derivatives matrix D tell us how rapidly the displacement u(r) 
varies as we move around inside the solid, and you might reasonably guess that D 
would be a good measure of strain. Unfortunately there is one more complication to 
discuss. We have already noted that a rigid translation of our solid should not count 
as a strain, and the same is true of any rigid rotation. Thus, we must examine what 
form D would take for a rigid rotation and then, for an arbitrary displacement given 
by D, somehow subtract out of D that part which corresponds to a rigid rotation, to 
leave something that truly represents what we want to mean by strain. 

In what follows we shall be concerned only with small strains (meaning that all
of the derivatives in D are much less than 1). So let us consider a small rigid rotation
which we can label by a vector $\boldsymbol{\theta} = \theta\,\hat{\mathbf{u}}$, where the unit vector $\hat{\mathbf{u}}$ identifies the axis of
rotation and $\theta$ is the (small) angle of rotation. It is not hard to calculate the resulting
displacement of any point r from scratch, but we can save a little trouble by recalling
Equation (9.22), $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, for the velocity of a point r in a rigid body rotating
with angular velocity $\boldsymbol{\omega}$. Multiplying both sides by a small time dt, we find that the
displacement u(r) is 15

$$\mathbf{u}(\mathbf{r}) = \mathbf{v}\,dt = \boldsymbol{\omega}\,dt \times \mathbf{r} = \boldsymbol{\theta} \times \mathbf{r} \tag{16.77}$$

since $\boldsymbol{\theta} = \boldsymbol{\omega}\,dt$. If you write out the components of this equation and differentiate, you
can easily check that du = D dr as in (16.75) where

$$\mathbf{D} = \begin{bmatrix} 0 & -\theta_3 & \theta_2 \\ \theta_3 & 0 & -\theta_1 \\ -\theta_2 & \theta_1 & 0 \end{bmatrix} \qquad \text{[any small rotation]}. \tag{16.78}$$

That is, for any small rotation given by the vector $\boldsymbol{\theta} = (\theta_1, \theta_2, \theta_3)$ the derivatives
matrix is given by the antisymmetric matrix (16.78). (A matrix M is antisymmetric
if $\tilde{\mathbf{M}} = -\mathbf{M}$, where $\tilde{\mathbf{M}}$ denotes the transpose of M.) Conversely, any antisymmetric
matrix has the form (16.78), so any
antisymmetric matrix (with all its elements small) is the derivatives matrix for a small
rotation. Therefore, if the derivatives matrix (16.76) is found to be antisymmetric, it
corresponds to a rotation and should not be considered a strain.
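The claim that a small rotation has an antisymmetric derivatives matrix can be checked numerically. The following is a minimal sketch, assuming the displacement u(r) = θ × r of (16.77) and estimating the derivatives ∂u_i/∂r_j by central differences; the rotation vector, the sample point, and the function names are illustrative choices, not part of the text.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotation_displacement(theta, r):
    """Displacement u(r) = theta x r for a small rotation vector theta, as in (16.77)."""
    return cross(theta, r)

def derivatives_matrix(displacement, r, h=1e-6):
    """Central-difference estimate of D[i][j] = du_i/dr_j at the point r."""
    d = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        r_plus = list(r)
        r_minus = list(r)
        r_plus[j] += h
        r_minus[j] -= h
        u_plus = displacement(r_plus)
        u_minus = displacement(r_minus)
        for i in range(3):
            d[i][j] = (u_plus[i] - u_minus[i]) / (2 * h)
    return d

# Illustrative small rotation vector (radians) and an arbitrary sample point.
THETA = [0.01, -0.02, 0.03]
D = derivatives_matrix(lambda r: rotation_displacement(THETA, r), [0.3, -0.5, 0.8])

# The matrix predicted by (16.78) for the same rotation.
EXPECTED = [[0.0, -THETA[2], THETA[1]],
            [THETA[2], 0.0, -THETA[0]],
            [-THETA[1], THETA[0], 0.0]]
```

Since u is linear in r here, the finite-difference estimate agrees with (16.78) to rounding error at any sample point.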

To exploit the result of the last paragraph, we need to use an elementary theorem
from matrix theory, that any square matrix M can be written as the sum of two matrices,
one of which is antisymmetric and one symmetric, as we can easily verify from the
following obvious identity:

$$\mathbf{M} = \tfrac{1}{2}(\mathbf{M} - \tilde{\mathbf{M}}) + \tfrac{1}{2}(\mathbf{M} + \tilde{\mathbf{M}}). \tag{16.79}$$

Since the first of these is clearly antisymmetric and the second symmetric, this
proves the claimed theorem. The derivatives matrix, D, of any displacement can be
decomposed in this way as

$$\mathbf{D} = \mathbf{A} + \mathbf{E}. \tag{16.80}$$

Here A is the antisymmetric part of D; it represents a rigid rotation and does not
contribute to the strain. The second term E is called the strain tensor; 16 it is the
symmetric part of D,

$$\mathbf{E} = \tfrac{1}{2}(\mathbf{D} + \tilde{\mathbf{D}}) \tag{16.81}$$

where D is the derivatives matrix, as defined in (16.76). The strain tensor defined in
this way is a good measure of strain as the following two examples illustrate.

15 It is important that dt, and hence $\theta$, be small. Otherwise, v will change appreciably during the
rotation.

16 The terminology here is hopelessly nonuniform. The “strain tensor” has many slightly different
definitions and is denoted by many different symbols. About the best one can say of the usage
here is that at least some other authors use it.
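The decomposition (16.79) to (16.81) is easy to carry out numerically. Here is a minimal sketch, assuming a 3 × 3 derivatives matrix stored as a list of rows; the sample numbers are made up for illustration and `decompose` is a hypothetical helper name.

```python
def transpose(m):
    """Return the transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def decompose(d):
    """Split D into (A, E): its antisymmetric and symmetric parts, as in (16.79)."""
    dt = transpose(d)
    n = len(d)
    a = [[0.5 * (d[i][j] - dt[i][j]) for j in range(n)] for i in range(n)]
    e = [[0.5 * (d[i][j] + dt[i][j]) for j in range(n)] for i in range(n)]
    return a, e

# Illustrative derivatives matrix with small elements (not taken from the text).
D = [[0.010, 0.004, 0.000],
     [0.000, -0.020, 0.006],
     [0.002, 0.000, 0.030]]

A, E = decompose(D)   # A: rigid-rotation part, E: strain tensor (16.81)
```

In this sketch A carries the rotation (for instance A[0][1] = −A[1][0]) and only E would enter a stress calculation.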


Example 16.5 Dilatation

The strain tensor E at a certain point P in a solid is a multiple of the unit matrix,

$$\mathbf{E} = e\,\mathbf{1} \tag{16.82}$$

where e is a small number, positive or negative. Describe the displacement of
points in the neighborhood of P (which, for convenience, we can take to be the
origin).

Since we are not interested in overall rigid displacements or rotations, we
may as well assume that the point P does not move and that the immediate
neighborhood of P is unrotated. In this case, the antisymmetric part A of D
in (16.80) is zero and D = E. Thus a point at position dr (relative to P) is
displaced to dr + du, as illustrated in Figure 16.18, where $d\mathbf{u} = \mathbf{E}\,d\mathbf{r} = e\,d\mathbf{r}$.
That is, the point dr is moved to (1 + e)dr. Since this statement is independent
of the direction of dr, we conclude that the whole sphere of any small radius dr
is enlarged, or dilated, in all directions by the factor 1 + e, and we refer to the
strain (16.82) as a spherical strain or a dilatation. If e is positive, the sphere
is actually enlarged; if e is negative, it is reduced.

Figure 16.18 The strain (16.82) moves the point at dr radially out to
(1 + e)dr. Thus any small sphere centered on P is just dilated by a
factor of 1 + e in all directions.

For future reference, notice that since any volume is stretched by a factor of
(1 + e) in all three directions, volumes are increased by a factor of $(1 + e)^3 \approx 1 + 3e$.
(Remember that $e \ll 1$.) In other words, the dilatation $\mathbf{E} = e\,\mathbf{1}$ results in
a fractional increase of 3e in any small volume; that is,

$$\frac{dV}{V} = 3e. \tag{16.83}$$

Example 16.6 A Shearing Strain

The strain tensor E at a certain point P in a solid has the form

$$\mathbf{E} = \begin{bmatrix} 0 & \gamma & 0 \\ \gamma & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \tag{16.84}$$

(with $\gamma \ll 1$) or, if we denote the elements of E by $\epsilon_{ij}$, then $\epsilon_{12} = \epsilon_{21} = \gamma$ while
all other $\epsilon_{ij}$ are zero. Describe the displacement of points in the neighborhood
of P (which we can take to be the origin again).

As before we’ll assume that there is no overall translation or rotation, so that
E is the same as the derivatives matrix D, whose only nonzero elements are

$$\frac{\partial u_1}{\partial r_2} = \frac{\partial u_2}{\partial r_1} = \gamma.$$

This implies that if we move out along the $r_2$ axis, the only component of u that
changes is $u_1$. Therefore, any point on the $r_2$ axis is displaced sideways, in the $r_1$
direction, as shown in Figure 16.19. Similarly, a point on the $r_1$ axis is displaced
upward in the $r_2$ direction. The net effect is a shear in which the two axes are
tilted toward one another as shown in the figure. (This is if $\gamma$ is positive; if $\gamma$ is
negative, they tilt the other way.) The angle through which both axes tilt is equal
to the parameter $\gamma$, as long as $\gamma$ is small.

Figure 16.19 Under the strain (16.84), points on the $r_2$ axis move in
the $r_1$ direction and vice versa. The result is a shear in which the two
axes tilt as shown.
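The tilting of the axes under the shear (16.84) can be illustrated with a few lines of code. This is a sketch only, assuming du = E dr with no overall translation or rotation; the value of γ and the helper names are illustrative.

```python
import math

def shear_strain(gamma):
    """Strain tensor of the form (16.84): a pure shear in the r1-r2 plane."""
    return [[0.0, gamma, 0.0],
            [gamma, 0.0, 0.0],
            [0.0, 0.0, 0.0]]

def displace(strain, dr):
    """Return the displaced point dr + du, where du = E dr."""
    du = [sum(strain[i][j] * dr[j] for j in range(3)) for i in range(3)]
    return [dr[i] + du[i] for i in range(3)]

GAMMA = 0.01                      # illustrative small shear parameter
E = shear_strain(GAMMA)

on_r1 = displace(E, [1.0, 0.0, 0.0])   # a point on the r1 axis moves in the r2 direction
on_r2 = displace(E, [0.0, 1.0, 0.0])   # a point on the r2 axis moves in the r1 direction

tilt_r1 = math.atan2(on_r1[1], on_r1[0])   # angle by which the r1 axis tilts toward r2
tilt_r2 = math.atan2(on_r2[0], on_r2[1])   # angle by which the r2 axis tilts toward r1
```

Both tilt angles equal arctan γ, which differs from γ only at third order in γ.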

We see from this last example that the off-diagonal elements $\epsilon_{ij}$ of the strain
tensor are associated with shearing strains. In the same way, the diagonal elements are
associated with stretching along the axes. For instance if $\epsilon_{11}$ is nonzero, then points
on the $r_1$ axis are displaced along the axis (in addition to any sideways displacement),
and the whole axis is stretched by a factor $1 + \epsilon_{11}$. For this reason, the three diagonal
elements $\epsilon_{11}$, $\epsilon_{22}$, and $\epsilon_{33}$ can be called the stretching elements of E.

Decomposition of the General Strain Tensor

The last two examples lead us to our final maneuver with the strain tensor. We’ve
seen that if E is diagonal and its diagonal elements are equal, $\epsilon_{11} = \epsilon_{22} = \epsilon_{33} = e$, so
that $\mathbf{E} = e\,\mathbf{1}$, then the corresponding deformation is the simple dilatation of Example
16.5. Even if a given strain tensor E does not meet these conditions, you might guess
that we could define e as its average stretch, that is, the average of the three diagonal
elements,

$$e = \tfrac{1}{3}(\epsilon_{11} + \epsilon_{22} + \epsilon_{33}) \tag{16.85}$$

and that the simple dilatation $e\,\mathbf{1}$ would bear some useful relation to the original tensor
E. Before I show that this is the case, I should mention that, in the theory of matrices,
the sum of the diagonal elements is an important concept called the trace of the matrix.
That is, for any (n × n) matrix M with elements $m_{ij}$, we define its trace, tr M, as the
sum

$$\operatorname{tr}\mathbf{M} = \sum_{i=1}^{n} m_{ii} = m_{11} + \cdots + m_{nn}. \tag{16.86}$$

Thus another way to state the definition (16.85) of the average stretch e of any strain
tensor E is that it is 1/3 of the trace:

$$e = \tfrac{1}{3}\operatorname{tr}\mathbf{E}. \tag{16.87}$$

The matrix $e\,\mathbf{1}$ is a pure dilatation, which naturally changes the volume of any small
region around the point of interest, and it can be shown (Problem 16.24) that it changes
the volume by the same amount as the original strain E. Therefore, if we write

$$\mathbf{E} = e\,\mathbf{1} + \mathbf{E}' \tag{16.88}$$

we have expressed E as the sum of two separate strains, the first of which is a pure
dilatation that changes volumes by the same amount as E and the second of which gives
the same shearing strains as E but causes no change of volumes. We can call the first
term, $e\,\mathbf{1}$, the spherical part of E. The second term, $\mathbf{E}' = \mathbf{E} - e\,\mathbf{1}$, is sometimes called
the strain deviator or deviatoric part of E, presumably because it is the amount
by which E deviates from the corresponding pure dilatation. Mathematically we can
characterize the decomposition (16.88) by saying that the first term is a multiple of
the unit matrix with the same trace as E and the second has zero trace. We shall see
in the next section that this decomposition of E plays an essential role in the relation
between strain and stress.


Example 16.7 A Numerical Example of Strain

The strain tensor at a certain point in a solid is found to be

$$\mathbf{E} = \begin{bmatrix} -0.01 & 0.02 & 0.05 \\ 0.02 & 0.03 & 0.04 \\ 0.05 & 0.04 & 0.04 \end{bmatrix}. \tag{16.89}$$

Decompose this strain as in (16.88) into its spherical and deviatoric parts.

The average stretch is easily seen to be $e = \tfrac{1}{3}\operatorname{tr}\mathbf{E} = 0.02$ and $\mathbf{E}'$ is then found
by subtraction as $\mathbf{E}' = \mathbf{E} - e\,\mathbf{1}$. So

$$e\,\mathbf{1} = \begin{bmatrix} 0.02 & 0 & 0 \\ 0 & 0.02 & 0 \\ 0 & 0 & 0.02 \end{bmatrix}
\quad\text{and}\quad
\mathbf{E}' = \begin{bmatrix} -0.03 & 0.02 & 0.05 \\ 0.02 & 0.01 & 0.04 \\ 0.05 & 0.04 & 0.02 \end{bmatrix}. \tag{16.90}$$

It is easy to check that the original strain tensor E is indeed equal to $e\,\mathbf{1} + \mathbf{E}'$.
Notice that the trace of $e\,\mathbf{1}$ is the same as that of E, as it should be, while the
trace of $\mathbf{E}'$ is zero.
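The same decomposition can be reproduced in a few lines of code. This is a minimal sketch, assuming the strain tensor (16.89) is stored as a list of rows; `spherical_and_deviatoric` is a hypothetical helper name.

```python
def trace(m):
    """Sum of the diagonal elements, as in (16.86)."""
    return sum(m[i][i] for i in range(len(m)))

def spherical_and_deviatoric(strain):
    """Return (e, e*1, E') with e = tr(E)/3 and E' = E - e*1, as in (16.87) and (16.88)."""
    n = len(strain)
    e = trace(strain) / 3.0
    spherical = [[e if i == j else 0.0 for j in range(n)] for i in range(n)]
    deviator = [[strain[i][j] - spherical[i][j] for j in range(n)] for i in range(n)]
    return e, spherical, deviator

# Strain tensor of (16.89).
E = [[-0.01, 0.02, 0.05],
     [0.02, 0.03, 0.04],
     [0.05, 0.04, 0.04]]

e, SPHERICAL, DEVIATOR = spherical_and_deviatoric(E)
```

Up to floating-point rounding, `e` is 0.02, `SPHERICAL` has the same trace as `E`, and `DEVIATOR` is traceless.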


16.9 Relation between Stress and Strain: Hooke’s Law 


The final step in writing down the equation of motion for a continuous solid is to find
the relation between the stress and strain tensors, $\boldsymbol{\Sigma}$ and E. This relation, sometimes
called the constitutive equation, corresponds to Hooke’s law for a mass on a spring,
expressing the force (or stress) in terms of the extension (or strain). It is reasonable
to assume that, at least for small disturbances, the relation between stress and strain
should be linear, and this is what I shall assume here. There certainly are plenty of
examples of materials that fit this assumption reasonably well: a chunk of rubber
or metal, or even the rock of the earth’s crust. When the required relation is linear,
it is called the generalized Hooke’s law. To simplify the discussion still further (and
this is a huge simplification), I shall assume that the solid is isotropic, which implies
that the relation between $\boldsymbol{\Sigma}$ and E is independent of our choice of axes, or rotationally
invariant.

We wish to express the stress tensor $\boldsymbol{\Sigma}$ as a function $\boldsymbol{\Sigma} = f(\mathbf{E})$ of the strain tensor
E. The function f must be linear, and it must be rotationally invariant. Linearity is a
familiar property. To be rotationally invariant means this: If R denotes any rotation of
our coordinate axes and $\mathbf{M}_R$ the result of rotating a matrix M by the rotation R, then
it must be true that

$$f(\mathbf{E}_R) = [f(\mathbf{E})]_R \tag{16.91}$$

for any strain E and any rotation R. That is, the stress corresponding to the rotated
strain $\mathbf{E}_R$ (left side of the equation) must be the same as the result of rotating the
stress corresponding to E (right side of the equation). Now, we have seen that any
strain tensor can be decomposed as the sum of its spherical and deviatoric parts:

$$\mathbf{E} = e\,\mathbf{1} + \mathbf{E}'. \tag{16.92}$$

This decomposition has two important properties. First, it is rotationally invariant;
that is, when we rotate axes, each part rotates separately into the corresponding part
of the rotated tensor. (The spherical part of E rotates into the spherical part of $\mathbf{E}_R$, and
likewise the deviatoric part.) Second, it is impossible to decompose E any further and
retain this property. 17

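The crucial result, the proof of which is unfortunately beyond the scope of the
mathematics I am assuming here, is this: If $\boldsymbol{\Sigma} = f(\mathbf{E})$ where the function f is linear
and rotationally invariant, and if E is decomposed as in (16.92), then the most general
possible form of the function f is this:

$$\boldsymbol{\Sigma} = \alpha e\,\mathbf{1} + \beta\,\mathbf{E}' \tag{16.93}$$

where $\alpha$ and $\beta$ are two constants (which depend on the material of which our solid is
made) and $e = \tfrac{1}{3}\operatorname{tr}\mathbf{E}$ as usual. 18 The relation (16.93) is called the generalized Hooke’s
law, or just Hooke’s law, and any solid that obeys it is called an elastic solid. It is often
convenient to rewrite Hooke’s law (16.93) in terms of E (rather than $\mathbf{E}' = \mathbf{E} - e\,\mathbf{1}$), to
give 19

$$\boldsymbol{\Sigma} = (\alpha - \beta)e\,\mathbf{1} + \beta\,\mathbf{E}. \tag{16.94}$$

We can solve this for E in terms of $\boldsymbol{\Sigma}$ in two steps. Taking the trace of (16.93) we find
that $\operatorname{tr}\boldsymbol{\Sigma} = 3\alpha e$, so $e = \operatorname{tr}\boldsymbol{\Sigma}/(3\alpha)$. Substituting this into (16.94) we find that

$$\mathbf{E} = \frac{1}{3\alpha\beta}\left[3\alpha\,\boldsymbol{\Sigma} - (\alpha - \beta)(\operatorname{tr}\boldsymbol{\Sigma})\,\mathbf{1}\right]. \tag{16.95}$$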
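A quick way to gain confidence in (16.95) is to check numerically that it undoes (16.94). The sketch below assumes 3 × 3 tensors stored as lists of rows; the values of α and β and the sample strain are illustrative assumptions, and the function names are hypothetical.

```python
def trace(m):
    """Sum of the diagonal elements of a 3x3 matrix."""
    return sum(m[i][i] for i in range(3))

def stress_from_strain(strain, alpha, beta):
    """Hooke's law in the form (16.94): Sigma = (alpha - beta) e 1 + beta E."""
    e = trace(strain) / 3.0
    return [[(alpha - beta) * e * (1.0 if i == j else 0.0) + beta * strain[i][j]
             for j in range(3)] for i in range(3)]

def strain_from_stress(stress, alpha, beta):
    """Inverse relation (16.95): E in terms of Sigma."""
    tr = trace(stress)
    return [[(3.0 * alpha * stress[i][j]
              - (alpha - beta) * tr * (1.0 if i == j else 0.0)) / (3.0 * alpha * beta)
             for j in range(3)] for i in range(3)]

# Illustrative material constants (say in GPa) and a small sample strain.
ALPHA, BETA = 120.0, 50.0
E = [[-0.01, 0.02, 0.05],
     [0.02, 0.03, 0.04],
     [0.05, 0.04, 0.04]]

SIGMA = stress_from_strain(E, ALPHA, BETA)
E_BACK = strain_from_stress(SIGMA, ALPHA, BETA)   # should reproduce E
```

The round trip also confirms the intermediate step tr Σ = 3αe, since the deviatoric part contributes nothing to the trace.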


As you might guess, the constants $\alpha$ and $\beta$ are related to the elastic moduli
introduced at the end of Section 16.6 [Equations (16.55) to (16.57)]. Let’s start with
the bulk modulus.


17 In the language of group theory, the two parts in (16.92) are irreducible. For a discussion of
the necessary group theory see Chapter 10 of Mathematics for Scientists and Engineers by Harold
Cohen, Prentice Hall (1992) or Chapter 16 of Mathematical Methods of Physics by Jon Mathews
and R. L. Walker, W. A. Benjamin (1970).

18 Within the framework of group representations, the proof is amazingly simple: The linear
function f must commute with all rotations. The decomposition (16.92) splits the space of all
symmetric matrices into two irreducible subspaces (of dimensions 1 and 5 respectively). By Schur’s
lemma, the restriction of f to either of these irreducible subspaces can be at most multiplication by
a scalar (which we can call $\alpha$ or $\beta$) and we have proved (16.93).

19 This equation is often written as $\boldsymbol{\Sigma} = 3\lambda e\,\mathbf{1} + 2\mu\,\mathbf{E}$, in which case $\lambda$ and $\mu$ are called the Lamé
constants.

Bulk Modulus

Imagine a solid subject only to an isotropic pressure, p (no shear stresses). In this case
we know that the stress tensor has the simple form $\boldsymbol{\Sigma} = -p\,\mathbf{1}$. Substituting this into
Hooke’s law (16.95) we find that

$$\mathbf{E} = \frac{1}{\alpha\beta}\left[-\alpha + (\alpha - \beta)\right]p\,\mathbf{1} = -\frac{p}{\alpha}\,\mathbf{1}. \tag{16.96}$$

That is, the strain tensor E is also a multiple of the unit matrix, which we could write
as $\mathbf{E} = e\,\mathbf{1}$, with $e = -p/\alpha$. But we know from (16.83) that $e = \tfrac{1}{3}\,dV/V$. Comparing
these two expressions for e, we conclude that $p = -\tfrac{1}{3}\alpha\,dV/V$, and comparing this
with the definition (16.56) of the bulk modulus, we see that

$$\alpha = 3\,\text{BM}. \tag{16.97}$$


Shear Modulus 

Let us consider next the simple shearing strain E given in (16.84) of Example 16.6 and
illustrated in Figure 16.19. Since this has zero trace, e = 0, and Hooke’s law (16.94)
reduces to the simple form

$$\boldsymbol{\Sigma} = \beta\,\mathbf{E}.$$

In particular, the stress responsible for the shear is

$$\frac{F}{A} = \sigma_{12} = \beta\,\epsilon_{12} = \beta\gamma. \tag{16.98}$$

We need to compare this with the definition (16.57) of the shear modulus. Unfortunately,
(16.57) refers to the strain of Figure 16.13, which is not quite the same as
that of Figure 16.19. Specifically, in Figure 16.19 both axes tilt inward by an angle
$\gamma$, whereas in Figure 16.13 the x axis turns through angle $\theta$ while the y axis is
unchanged. A moment’s thought should convince you that the displacement of Figure
16.13 is a combination of a simple shear, as in Figure 16.19, followed by a rotation to
bring the y axis back to its original direction. This means that the angle $\theta$ of Figure
16.13 equals twice the angle $\gamma$ of Figure 16.19, $\theta = 2\gamma$. Putting this into the definition
(16.57) of the shear modulus, we find that

$$\frac{F}{A} = \text{SM}\,\frac{dy}{dx} = \text{SM}\,\theta = 2\,\text{SM}\,\gamma,$$

and comparing with (16.98), we conclude that the constant $\beta$ is just twice the shear
modulus,

$$\beta = 2\,\text{SM}. \tag{16.99}$$

Let us now return to the equation of motion (16.101) and substitute (16.102) for
$\mathbf{F}_{\text{vol}}$ and (16.111) for $\mathbf{F}_{\text{sur}}$. When we do this, every term acquires a factor of dV, which
we can cancel to give

$$\varrho\,\frac{\partial^2\mathbf{u}}{\partial t^2} = \varrho\,\mathbf{g} + \nabla\cdot\boldsymbol{\Sigma}. \tag{16.112}$$

This important equation is easy to understand. The left side represents the ma of
ma = F. The first term on the right represents the force of gravity (or more generally
the body force) and the second the surface force.

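Before we can use this equation of motion, we must use Hooke’s law to replace
the stress tensor $\boldsymbol{\Sigma}$ by the strain tensor E. In the form (16.94), Hooke’s law reads

$$\boldsymbol{\Sigma} = (\alpha - \beta)e\,\mathbf{1} + \beta\,\mathbf{E} \tag{16.113}$$

or, in terms of its components,

$$\sigma_{ji} = (\alpha - \beta)e\,\delta_{ji} + \beta\,\epsilon_{ji} \tag{16.114}$$

where $\delta_{ji}$ denotes the Kronecker delta symbol

$$\delta_{ji} = \begin{cases} 1 & \text{if } j = i \\ 0 & \text{if } j \neq i. \end{cases}$$

Since $\epsilon_{ji} = \tfrac{1}{2}(\partial_j u_i + \partial_i u_j)$, the average stretch e is

$$e = \tfrac{1}{3}\sum_i \epsilon_{ii} = \tfrac{1}{3}\sum_i \partial_i u_i = \tfrac{1}{3}\nabla\cdot\mathbf{u}. \tag{16.115}$$

Putting these results into (16.114), we find

$$\sigma_{ji} = \tfrac{1}{3}(\alpha - \beta)\,\delta_{ji}\,\nabla\cdot\mathbf{u} + \tfrac{1}{2}\beta\,(\partial_j u_i + \partial_i u_j).$$

This lets us evaluate $\nabla\cdot\boldsymbol{\Sigma}$ for use in (16.112):

$$(\nabla\cdot\boldsymbol{\Sigma})_i = \sum_j \partial_j \sigma_{ji}
= \tfrac{1}{3}(\alpha - \beta)\sum_j \partial_j \delta_{ji}\,(\nabla\cdot\mathbf{u})
+ \tfrac{1}{2}\beta\sum_j \left(\partial_j \partial_i u_j + \partial_j \partial_j u_i\right).$$

Each of the terms in this ugly result simplifies. In the first term, notice that
$\sum_j \partial_j \delta_{ji} = \partial_i$. In the second term, $\sum_j \partial_j \partial_i u_j = \partial_i \sum_j \partial_j u_j = \partial_i\,\nabla\cdot\mathbf{u}$, and in the third,
$\sum_j \partial_j \partial_j = \nabla^2$. Therefore

$$\begin{aligned}
\nabla\cdot\boldsymbol{\Sigma} &= \tfrac{1}{3}(\alpha - \beta)\,\nabla(\nabla\cdot\mathbf{u}) + \tfrac{1}{2}\beta\,\nabla(\nabla\cdot\mathbf{u}) + \tfrac{1}{2}\beta\,\nabla^2\mathbf{u} \\
&= \left(\tfrac{1}{3}\alpha + \tfrac{1}{6}\beta\right)\nabla(\nabla\cdot\mathbf{u}) + \tfrac{1}{2}\beta\,\nabla^2\mathbf{u} \\
&= \left(\text{BM} + \tfrac{1}{3}\text{SM}\right)\nabla(\nabla\cdot\mathbf{u}) + \text{SM}\,\nabla^2\mathbf{u}
\end{aligned} \tag{16.116}$$

where in the last line I have used (16.97) and (16.99) to rewrite $\alpha$ and $\beta$ in terms of
the bulk and shear moduli, BM and SM.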
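The coefficient algebra in the last two lines of (16.116) can be checked with exact rational arithmetic. This is only a sketch of that one step, assuming α = 3 BM and β = 2 SM from (16.97) and (16.99); the numerical moduli are illustrative and `navier_coefficients` is a hypothetical name.

```python
from fractions import Fraction

def navier_coefficients(bulk_modulus, shear_modulus):
    """Coefficients of grad(div u) and of laplacian(u) in (16.116), built from alpha and beta."""
    alpha = 3 * bulk_modulus      # (16.97)
    beta = 2 * shear_modulus      # (16.99)
    grad_div = Fraction(1, 3) * (alpha - beta) + Fraction(1, 2) * beta
    laplacian = Fraction(1, 2) * beta
    return grad_div, laplacian

# Illustrative moduli (say in GPa), kept as exact fractions.
BM, SM = Fraction(40), Fraction(25)
GRAD_DIV, LAPLACIAN = navier_coefficients(BM, SM)
```

With these inputs `GRAD_DIV` equals BM + SM/3 and `LAPLACIAN` equals SM exactly, as the last line of (16.116) requires.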




Finally we are ready to write the equation of motion of an elastic solid in a usable
form. Substituting (16.116) into (16.112), we find

$$\varrho\,\frac{\partial^2\mathbf{u}}{\partial t^2} = \varrho\,\mathbf{g} + \left(\text{BM} + \tfrac{1}{3}\text{SM}\right)\nabla(\nabla\cdot\mathbf{u}) + \text{SM}\,\nabla^2\mathbf{u}. \tag{16.117}$$

In the next section, we’ll use this equation, often called the Navier equation (after the
French engineer Claude Navier, 1785-1836), to derive the two main kinds of wave in
an elastic solid.


16.11 Longitudinal and Transverse Waves in a Solid 


It is well known that there are two main kinds of wave in an elastic solid: longitudinal
and transverse. To show this, we’ll examine the Navier equation (16.117). We’ll
assume that gravity is unimportant and set g = 0. (This is usually an excellent
approximation, one exception being very slow free oscillations of the earth, with
periods $\tau \gtrsim 200$ s, for which gravity is important.) Without loss of generality, we’ll look for a plane
wave propagating in the x (that is, $r_1$) direction, so that u depends only on x and t.

Let us first examine the possibility of a longitudinal disturbance, for which the
displacement u would be in the direction of propagation,

$$\mathbf{u} = [u_x(x, t), 0, 0].$$

In this case, $\nabla\cdot\mathbf{u} = \partial u_x/\partial x$ and the only nonzero component of $\nabla(\nabla\cdot\mathbf{u})$ is its x
component, which is $\partial^2 u_x/\partial x^2$. The only nonzero component of $\nabla^2\mathbf{u}$ is likewise the
x component, which is also equal to $\partial^2 u_x/\partial x^2$. Putting all of this together in the
equation of motion (16.117), we obtain

$$\varrho\,\frac{\partial^2 u_x}{\partial t^2} = \left(\text{BM} + \tfrac{4}{3}\text{SM}\right)\frac{\partial^2 u_x}{\partial x^2}. \tag{16.118}$$

This is the wave equation, with wave speed

$$c_{\text{long}} = \sqrt{\frac{\text{BM} + \tfrac{4}{3}\text{SM}}{\varrho}}. \tag{16.119}$$

We conclude that longitudinal waves are indeed possible, with speed $c_{\text{long}}$ given by
(16.119).

If instead we look for a transverse (or shear) wave, traveling in the x direction but
with the displacement in the y direction, then u would have the form

$$\mathbf{u} = [0, u_y(x, t), 0].$$

In this case $\nabla\cdot\mathbf{u} = 0$, while $\nabla^2\mathbf{u}$ has only a y component equal to $\partial^2 u_y/\partial x^2$, so
(16.117) becomes

$$\varrho\,\frac{\partial^2 u_y}{\partial t^2} = \text{SM}\,\frac{\partial^2 u_y}{\partial x^2}.$$

This is the wave equation, with wave speed

$$c_{\text{tran}} = \sqrt{\frac{\text{SM}}{\varrho}} \tag{16.120}$$

and we conclude that transverse waves are possible, with speed $c_{\text{tran}}$ given by (16.120).

Notice that $c_{\text{long}} > c_{\text{tran}}$. Therefore, if longitudinal and transverse signals set out
simultaneously from some source, the longitudinal one will arrive first at a distant
detector. For instance, it is well known in the earth sciences that the longitudinal
waves from a distant earthquake arrive before the transverse ones, and this gives
seismologists a way to measure how far away an earthquake or explosion occurred.
For this reason, longitudinal waves are also called primary or P waves and transverse
secondary or S waves.


Example 16.8 Waves in Rock

The elastic moduli of the material of the earth’s crust vary, but representative
values would be BM ≈ 40 GPa and SM ≈ 25 GPa. (These are the approximate
values for granite, whose density is about $2.7 \times 10^3$ kg/m³, and 1 GPa = $10^9$ Pa
= $10^9$ N/m².) What would be the speed of longitudinal and transverse seismic
waves in rock with these values?

According to (16.119), the longitudinal speed is

$$c_{\text{long}} = \sqrt{\frac{(40 + \tfrac{4}{3} \times 25) \times 10^9\ \text{N/m}^2}{2.7 \times 10^3\ \text{kg/m}^3}} \approx 5.2\ \text{km/s}.$$

Similarly, from (16.120) we find the transverse speed to be

$$c_{\text{tran}} = \sqrt{\frac{25 \times 10^9\ \text{N/m}^2}{2.7 \times 10^3\ \text{kg/m}^3}} \approx 3.0\ \text{km/s}.$$
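The two speeds, and the way a seismologist might use the gap between the P and S arrivals, can be sketched as follows. The sketch assumes the formulas (16.119) and (16.120), the representative granite values quoted in the example, and straight-line propagation through uniform rock (a strong simplification for the real earth); the function names are hypothetical.

```python
import math

def longitudinal_speed(bulk_modulus, shear_modulus, density):
    """c_long of (16.119); moduli in Pa, density in kg/m^3, result in m/s."""
    return math.sqrt((bulk_modulus + 4.0 * shear_modulus / 3.0) / density)

def transverse_speed(shear_modulus, density):
    """c_tran of (16.120); modulus in Pa, density in kg/m^3, result in m/s."""
    return math.sqrt(shear_modulus / density)

BM = 40e9      # bulk modulus in Pa (representative value from the example)
SM = 25e9      # shear modulus in Pa
RHO = 2.7e3    # density in kg/m^3

C_LONG = longitudinal_speed(BM, SM, RHO)
C_TRAN = transverse_speed(SM, RHO)

def s_minus_p_delay(distance):
    """Time by which the S wave lags the P wave after travelling `distance` metres."""
    return distance / C_TRAN - distance / C_LONG

def distance_from_delay(delay):
    """Invert s_minus_p_delay: source distance (m) from the observed S-P delay (s)."""
    return delay / (1.0 / C_TRAN - 1.0 / C_LONG)

if __name__ == "__main__":
    print("c_long (m/s):", C_LONG)
    print("c_tran (m/s):", C_TRAN)
    print("distance for a 10 s delay (m):", distance_from_delay(10.0))
```

Because the delay grows linearly with distance in this uniform-medium picture, a single measured S minus P delay fixes the distance to the source.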


A striking feature of the formula (16.120) for the transverse wave speed is that if
the shearing modulus SM is zero (as it is in fluids) then $c_{\text{tran}}$ is zero. This suggests
(correctly) that fluids cannot support transverse waves. 21 A beautiful application of this
result is that transverse seismic waves (unlike longitudinal) are found not to propagate
through the center of the earth, showing that some region near the earth’s center
(namely, the outer core) is liquid.


21 Of course this argument is not quite watertight since our derivation of (16.120) assumed a
solid medium. Nevertheless the suggested conclusion is correct, and correct for essentially the right
reason: Transverse waves require a transverse (shearing) restoring force, which a fluid cannot supply.

16.12 Fluids: Description of the Motion*


* As usual, sections marked with an asterisk can be omitted on a first reading.

In the last four sections we have focussed primarily on the motion of solid continuous
media. In the final two sections of this chapter, I would like to give a brief
introduction to the motion of fluids. Unfortunately, the analysis of a general fluid — 
particularly a viscous fluid — is complicated, and, to keep this chapter from growing 
unconscionably long, I shall restrict myself mostly to inviscid or ideal fluids, that is, 
fluids whose viscosity is negligible. One might argue that to neglect viscosity is to 
throw the baby out with the bathwater, but the fact is that there are many problems in 
fluid motion where it is reasonable to neglect viscosity, and, more important, all of the 
tools needed to discuss inviscid fluids are needed for the analysis of viscous fluids, 
so the discussion here is really an essential preliminary to any subsequent study of 
viscous fluids. Furthermore, several ideas discussed here — the convective derivative 
and the equation of continuity, in particular — are equally applicable to either case. 


Material versus Spatial Descriptions 

So far we have analyzed what is happening in a continuous medium by specifying that 
the piece of material that was originally at position r is now at position r + u(r, t). 
This approach is sometimes called the material description, since it focuses on a 
particular piece of the material. It turns out that in discussing fluids it is often more 
convenient not to follow the individual pieces of fluid, but rather to specify what 
is happening at each fixed point in space. Thus we might give the velocity v(r, t), 
the density g(r, t), and so forth, of the fluid at each fixed point r (and time t). This 
approach is often called the spatial description. 22 

Some advantages of the spatial approach when discussing a fluid are reasonably 
obvious: With a solid, each material piece of the solid usually has a well-defined 
equilibrium position r, and this is what we use to label each piece [giving the piece’s 
displacement u(r, t) from r]. In a fluid, the pieces of the fluid do not generally have 
a unique equilibrium position. We could, of course, use the piece’s initial position as 
a reference, but in a problem of fluid flow, the pieces usually stray inconveniently far 
from their initial positions. Thus in discussing a fluid, it is usually more convenient 
simply to focus on what is happening at each fixed point in space — the spatial 
description. 

22 The material and spatial descriptions are often called the Lagrangian and Eulerian descriptions, 
respectively, names that are neither historically accurate (both methods being due to Euler) 
nor especially easy to remember. 

The Material Derivative 

The main disadvantage of the spatial description emerges most clearly when we try to 
use Newton’s second law. The acceleration a in F = ma is, of course, the acceleration 
of any small piece of the fluid. Unfortunately, if v(r, t) is the fluid’s velocity at a 
point r, then a is not just ∂v/∂t. The partial derivative ∂v/∂t is the rate of change of 
the fluid velocity at the fixed point r, whereas the acceleration we want is the rate of 
change of v of a given piece of fluid as it moves. This same distinction applies to any 
other parameter and is, perhaps, easier to visualize in the case of a scalar, such as the 
density or temperature. Consider, for example, the density of a small element of the 
fluid. If at time t the element is at position r, then the required density is ϱ(r, t). But 
a short time dt later, the element will have moved to a new position r + dr, where 
dr = v dt, so the density is now ϱ(r + dr, t + dt). (Note well how, if we wish to 
follow the material element, we must evaluate the new density at the new time t + dt 
and the new position r + dr.) Thus the change in the density of our volume element is 

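$$ d\varrho = \varrho(\mathbf{r} + d\mathbf{r},\, t + dt) - \varrho(\mathbf{r}, t) = \frac{\partial\varrho}{\partial t}\,dt + d\mathbf{r}\cdot\nabla\varrho = \frac{\partial\varrho}{\partial t}\,dt + (\mathbf{v}\,dt)\cdot\nabla\varrho. $$

Dividing both sides by dt we find for the time derivative of ϱ 

$$ \frac{d\varrho}{dt} = \frac{\partial\varrho}{\partial t} + \mathbf{v}\cdot\nabla\varrho. \qquad (16.121) $$

This derivative is called the material derivative since it gives the rate of change of ϱ 
as we follow the motion of a material piece of the fluid. 23 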

We can apply a similar argument to other parameters of the fluid. For example, 
dp/dt (defined in exactly the same way) would be the rate of change of the pressure 
as we follow the motion of a material element of fluid. In particular, we can examine 
each component of the fluid’s velocity and putting the three components together find 


$$ \frac{d\mathbf{v}}{dt} = \frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}. \qquad (16.122) $$


This derivative is, of course, the acceleration of the volume element of fluid. Armed 
with this result, we are ready to write down the equation of motion for an inviscid 
fluid. 
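
As a quick numerical check of (16.121), here is a minimal Python sketch in one dimension. The density field and the uniform velocity below are made-up test functions (assumptions for illustration, not taken from the text). The sketch compares the formula ∂ϱ/∂t + v ∂ϱ/∂x with the rate of change found by actually following a fluid element from (x, t) to (x + v dt, t + dt).

```python
import math

def density(x, t):
    """Hypothetical 1D density field: a small ripple drifting to the right."""
    return 1.0 + 0.1 * math.sin(x - t)

def velocity(x, t):
    """Hypothetical uniform fluid velocity (an assumption for this check)."""
    return 0.5

def material_derivative_formula(x, t, h=1e-6):
    """Right side of (16.121): partial time derivative plus v times the gradient,
    both estimated by central differences at the fixed point x."""
    d_rho_dt = (density(x, t + h) - density(x, t - h)) / (2 * h)
    d_rho_dx = (density(x + h, t) - density(x - h, t)) / (2 * h)
    return d_rho_dt + velocity(x, t) * d_rho_dx

def material_derivative_following(x, t, dt=1e-6):
    """Follow the element to its new position x + v dt at the new time t + dt
    and divide the change in its density by dt."""
    dx = velocity(x, t) * dt
    return (density(x + dx, t + dt) - density(x, t)) / dt

x0, t0 = 0.3, 0.2
print(material_derivative_formula(x0, t0))
print(material_derivative_following(x0, t0))
```

The two printed numbers should agree to within the finite-difference error, and neither equals the partial derivative ∂ϱ/∂t alone, which is the point of the distinction drawn above.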


23 Other names for the material derivative are total or convective derivative. Some authors use 
the symbol D/Dt (instead of d/dt) to emphasize the special character of this derivative. 


Equation of Motion for an Inviscid Fluid 

Consider now a small volume element dV of fluid with mass ϱ dV and acceleration 
given by (16.122). Newton’s second law implies that 

$$ \varrho\,dV\,\frac{d\mathbf{v}}{dt} = \mathbf{F} = \mathbf{F}_{\mathrm{vol}} + \mathbf{F}_{\mathrm{sur}}. \qquad (16.123) $$

I shall assume that the only volume force is gravity, so that F_vol = ϱ dV g, where 
g is the acceleration of gravity. We know from (16.111) that the surface force is 
F_sur = ∇·Σ dV and from (16.70) that Σ = −p1, where 1 is the unit matrix. (This last was 
derived for a static fluid, but the argument needed only that the viscosity be negligible.) 
To evaluate ∇·Σ, we’ll take components: 

$$ (\nabla\cdot\boldsymbol{\Sigma})_i = \sum_j \partial_j \sigma_{ji} = -\sum_j \partial_j (p\,\delta_{ji}) = -\partial_i p. $$

Therefore, ∇·Σ = −∇p. Putting all these results together in (16.123), we see that 
every term contains a factor of dV, and, canceling this factor, we arrive at the equation 
of motion for an inviscid fluid, 

$$ \varrho\,\frac{d\mathbf{v}}{dt} = \varrho\,\mathbf{g} - \nabla p. \qquad (16.124) $$


Bernoulli’s Theorem 


As a first simple application of the equation of motion (16.124), let us derive a familiar 
result from most introductory physics courses, Bernoulli’s theorem. This theorem is 
usually stated for the case of steady flow of an incompressible, inviscid fluid, and this 
is the case I’ll consider here. That the flow is steady means that the parameters p, v, 
and ϱ at any fixed point r are constant; that is, the partial derivatives ∂p/∂t and so on 
are all zero. That the fluid is incompressible means that the density doesn’t change, 
dϱ/dt = 0. (Notice that in the case of the density, both ∂ϱ/∂t and dϱ/dt are zero.) 
I shall consider the case that g is uniform and in the negative z direction, so that we 
can write g = −∇(gz). 24 Let us make this replacement in (16.124), and then dot the 
whole equation with v. This gives 

$$ \varrho\,\mathbf{v}\cdot\frac{d\mathbf{v}}{dt} + \varrho\,\mathbf{v}\cdot\nabla(gz) + \mathbf{v}\cdot\nabla p = 0. \qquad (16.125) $$

We can simplify all three terms in this equation. For the first term, note that v·dv/dt = 
½ d(v²)/dt. For the second and third terms, note that v·∇f = df/dt − ∂f/∂t, for any 
function f. For the functions of interest here, ∂f/∂t = 0, so v·∇f = df/dt. With 
these replacements, (16.125) becomes 

$$ \tfrac{1}{2}\varrho\,\frac{d(v^2)}{dt} + \varrho\,\frac{d(gz)}{dt} + \frac{dp}{dt} = 0. \qquad (16.126) $$


24 Even if g is not uniform, it is certainly conservative, so we can always introduce a function Φ 
(called the gravitational potential) such that g = −∇Φ. To avoid introducing unfamiliar symbols, I 
decided to treat the common case that Φ = gz. 

Finally, since dϱ/dt = 0, we can bring the factors of ϱ inside the derivatives and we 
find that 

$$ \frac{d}{dt}\left(\tfrac{1}{2}\varrho v^2 + \varrho g z + p\right) = 0. \qquad (16.127) $$

This asserts that the quantity Ψ = ½ϱv² + ϱgz + p is constant as we move along with 
any material element of the fluid. In other words, Ψ is constant along any streamline, 
which is precisely the content of Bernoulli’s theorem for steady, incompressible, 
inviscid flow. Because the term ½ϱv² appears with the pressure p in the Bernoulli 
quantity Ψ and is associated with the motion, ½ϱv² is sometimes called the dynamic pressure. 
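
To see how the conserved quantity of (16.127) is used in practice, here is a minimal Python sketch. The function name `bernoulli_pressure` and the numbers in the example are my own illustrative choices, not from the text. Given the pressure, speed, and height at one point of a streamline, it returns the pressure at a second point on the same streamline, assuming steady, incompressible, inviscid flow.

```python
def bernoulli_pressure(p1, v1, z1, v2, z2, rho=1000.0, g=9.8):
    """Pressure at point 2 of a streamline, from the constancy of
    (1/2) rho v^2 + rho g z + p along that streamline (SI units)."""
    # Value of the conserved quantity, evaluated at point 1
    total = 0.5 * rho * v1**2 + rho * g * z1 + p1
    # Solve the same expression at point 2 for the pressure
    return total - 0.5 * rho * v2**2 - rho * g * z2

# Illustrative case: water speeding up from 1 m/s to 3 m/s in a level pipe
p2 = bernoulli_pressure(p1=2.0e5, v1=1.0, z1=0.0, v2=3.0, z2=0.0)
print(p2)  # 196000.0 Pa: the pressure falls where the fluid moves faster
```

Setting both speeds to zero reduces the same function to the hydrostatic result that the pressure rises by ϱg times the depth.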


The Equation of Continuity 

The conservation of mass implies an important relation, called the equation of continuity, 
between the density and velocity of any fluid, viscous or inviscid. If you have 
taken a course in electromagnetism, you may have met a corresponding relation that 
reflects the conservation of charge. To prove the relation, consider a small volume dV 
of fluid. As this volume moves, its mass ϱ dV cannot change. Therefore, 

$$ \frac{d}{dt}(\varrho\,dV) = dV\,\frac{d\varrho}{dt} + \varrho\,\frac{d}{dt}(dV) = 0. \qquad (16.128) $$

The rate of change of a moving volume was evaluated in Equation (13.59) of Section 
13.7. (If you did not read Chapter 13, you could read just the proof of this result, 
starting at the subsection “Changing Volumes” on page 544.) For any volume V (small 
or large) the result is 


$$ \frac{dV}{dt} = \int \nabla\cdot\mathbf{v}\,dV. $$

For an infinitesimal volume dV, this reduces to 

$$ \frac{d}{dt}(dV) = \nabla\cdot\mathbf{v}\,dV. $$


Inserting this in (16.128) and canceling the common factor of dV, we arrive at the 
equation of continuity, 

$$ \frac{d\varrho}{dt} + \varrho\,\nabla\cdot\mathbf{v} = 0 \qquad (16.129) $$

or equivalently (see Problem 16.34) 

$$ \frac{\partial\varrho}{\partial t} + \nabla\cdot(\varrho\mathbf{v}) = 0. \qquad (16.130) $$

This relation plays a crucial role in fluid dynamics, as does the corresponding relation 
(with ϱ replaced by the charge density) in electrodynamics. We shall use it in the next 
section in deriving the speed of sound in a fluid. 
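
The equation of continuity is easy to test numerically on a flow whose density and velocity are known in closed form. The following Python sketch uses a made-up one-dimensional example, a uniformly expanding fluid with v = x/(1 + t) and ϱ = ϱ₀/(1 + t). Both fields and the value of ϱ₀ are assumptions chosen for illustration. It evaluates ∂ϱ/∂t + ∂(ϱv)/∂x by central differences.

```python
def rho(x, t, rho0=1.2):
    """Density of a uniformly expanding 1D fluid (hypothetical test case)."""
    return rho0 / (1.0 + t)

def v(x, t):
    """Velocity of the same expansion: each element moves away from the origin."""
    return x / (1.0 + t)

def continuity_residual(x, t, h=1e-5):
    """Central-difference estimate of d(rho)/dt + d(rho v)/dx at a fixed point.
    Conservation of mass requires this to vanish."""
    d_rho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    flux_plus = rho(x + h, t) * v(x + h, t)    # mass flux rho*v just to the right
    flux_minus = rho(x - h, t) * v(x - h, t)   # mass flux rho*v just to the left
    d_flux_dx = (flux_plus - flux_minus) / (2 * h)
    return d_rho_dt + d_flux_dx

print(continuity_residual(0.7, 0.5))  # zero apart from finite-difference error
```

In this example the density at a fixed point falls at exactly the rate at which the diverging flow carries mass away, so the two terms cancel.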


16.13 Waves in a Fluid* 


*As usual, sections marked with an asterisk can be omitted on a first reading. 

Armed with the equations of motion and continuity, we can now discuss the possibility 
of waves in an inviscid fluid. We imagine a fluid which undergoes a small disturbance 
from equilibrium, so that it acquires a small (presumably oscillatory) velocity v and 
its pressure and density become 


$$ p = p_0 + p' \qquad (16.131) $$

and 

$$ \varrho = \varrho_0 + \varrho'. \qquad (16.132) $$

Here p₀ and ϱ₀ are the equilibrium values, and p′ and ϱ′ are small increments. Notice 
first that in equilibrium, the equation of motion (16.124) implies simply that 

$$ \varrho_0\,\mathbf{g} - \nabla p_0 = 0. \qquad (16.133) $$

For the actual situation, we insert (16.131) and (16.132) into the equation of motion 
to give 


$$ (\varrho_0 + \varrho')\left(\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = (\varrho_0 + \varrho')\,\mathbf{g} - \nabla(p_0 + p'). $$

This ugly equation simplifies. First, by the equilibrium condition (16.133), the first and 
third terms on the right cancel exactly. Second, by the assumption that the disturbance 
is small, we can drop any terms that are second order or higher in the small quantities 
v, ϱ′, or p′, or their derivatives. Thus we can ignore the term v·∇v, and likewise the 
terms involving ϱ′ on the left. This leaves us with 

$$ \varrho_0\,\frac{\partial\mathbf{v}}{\partial t} = \varrho'\,\mathbf{g} - \nabla p'. \qquad (16.134) $$

Finally, it is not hard to show (Problem 16.38) by putting in realistic numbers that the 
term ϱ′g on the right is negligible compared to ∇p′. Thus the equation of motion for 
our small disturbance is just 

$$ \varrho_0\,\frac{\partial\mathbf{v}}{\partial t} = -\nabla p'. \qquad (16.135) $$




We can treat the equation of continuity (16.130) in a similar way. Inserting (16.131) 
and (16.132), we get (as you should check) 

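$$ \frac{\partial\varrho'}{\partial t} = -\varrho_0\,\nabla\cdot\mathbf{v} - \mathbf{v}\cdot\nabla\varrho_0. \qquad (16.136) $$

For essentially the same reasons that the first term on the right of (16.134) was 
negligible, the last term here can be neglected (see Problem 16.38 again), so the 
equation of continuity becomes 

$$ \frac{\partial\varrho'}{\partial t} = -\varrho_0\,\nabla\cdot\mathbf{v}. \qquad (16.137) $$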


The equations of motion (16.135) and continuity (16.137) give two equations for 
the three variables v, p′, and ϱ′. We get a third equation by looking at the definition 
(16.56) of the bulk modulus, p = BM (−dV/V). As stated, this relates the pressure 
to the corresponding change of volume, but it applies equally to a change in pressure, 
in which case it would read dp = BM (−dV/V). (For a moment we’ll call the volume 
of interest V, so dV can be its increment.) Now, the mass of a given element of fluid 
cannot change, so the quantity ϱV must be a constant. Therefore ϱ dV + V dϱ = 0, 
or dV/V = −dϱ/ϱ. Combining these two results we see that 

$$ dp = \mathrm{BM}\,\frac{d\varrho}{\varrho}. \qquad (16.138) $$

Now, in our case, the change of pressure dp is what we’ve been calling p′. Likewise, 
dϱ is what we’ve been calling ϱ′, and the original density is ϱ₀. Therefore, in our 
current notation 

$$ p' = \mathrm{BM}\,\frac{\varrho'}{\varrho_0}. \qquad (16.139) $$

We can use this last result to eliminate ϱ′ from the equation of continuity (16.137) 
to give 

$$ \frac{\partial p'}{\partial t} = -\mathrm{BM}\,\nabla\cdot\mathbf{v}. $$

If we differentiate this with respect to t and invoke the equation of motion (16.135), 
we obtain 

$$ \frac{\partial^2 p'}{\partial t^2} = -\mathrm{BM}\,\nabla\cdot\frac{\partial\mathbf{v}}{\partial t} = \frac{\mathrm{BM}}{\varrho_0}\,\nabla^2 p' $$

where in the second expression I interchanged the order of the spatial and time 
derivatives, and in the third expression I used the equation of motion (16.135). This 
result is the three-dimensional wave equation with wave speed 

$$ c = \sqrt{\frac{\mathrm{BM}}{\varrho_0}}. \qquad (16.140) $$

I’ll show in just a moment that the waves in a fluid are necessarily longitudinal. Since 
longitudinal waves in a continuous medium are what we generally call sound waves, 
we have proved that the speed of sound in a fluid is c = √(BM/ϱ₀). 


EXAMPLE 16.9 Speed of Sound in Water 

Given that the bulk modulus of water is 2.2 GPa, what is the speed of sound in 
water? 

The density of water is, of course, 1000 kg/m³, so the speed of sound as given 
by (16.140) is 

$$ c = \sqrt{\frac{2.2\times 10^{9}\ \mathrm{N/m^2}}{10^{3}\ \mathrm{kg/m^3}}} \approx 1.5\ \mathrm{km/s}. $$
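
The arithmetic of this example can be reproduced in a couple of lines. This is only a sketch, and the function name `sound_speed` is my own label for the result (16.140), with the values for water taken from the example.

```python
import math

def sound_speed(bulk_modulus, density):
    """Wave speed c = sqrt(BM / rho_0) of (16.140), in SI units."""
    return math.sqrt(bulk_modulus / density)

# Water: BM = 2.2 GPa, density 1000 kg/m^3 (values from the example)
c_water = sound_speed(2.2e9, 1.0e3)
print(f"{c_water:.0f} m/s")  # 1483 m/s, that is, about 1.5 km/s
```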


To show that waves in a fluid are longitudinal, let us imagine a wave traveling in 
the direction of a unit vector n, so that 

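$$ p' = f(\mathbf{n}\cdot\mathbf{r} - ct). \qquad (16.141) $$

According to the equation of motion (16.135), this implies that 

$$ \frac{\partial\mathbf{v}}{\partial t} = -\frac{1}{\varrho_0}\nabla p' = -\frac{1}{\varrho_0}\nabla f(\mathbf{n}\cdot\mathbf{r} - ct) = -\frac{\mathbf{n}}{\varrho_0}\,f'(\mathbf{n}\cdot\mathbf{r} - ct) $$

where f′ denotes the derivative of f with respect to its argument. This relation can 
be immediately integrated to give the velocity of the fluid: 25 

$$ \mathbf{v} = -\frac{\mathbf{n}}{\varrho_0}\int f'(\mathbf{n}\cdot\mathbf{r} - ct)\,dt = \frac{\mathbf{n}}{c\varrho_0}\,f(\mathbf{n}\cdot\mathbf{r} - ct) = \frac{\mathbf{n}}{c\varrho_0}\,p'. \qquad (16.142) $$

(If you don’t see the integration here, try the change of variables ξ = n·r − ct.) This 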
result has two important features. First, we see that the fluid velocity is proportional to 
the pressure p'. In particular, in a sinusoidal wave, the velocity will oscillate in phase 
with p'. Second, the fluid velocity is in the direction of propagation; that is, the wave 
is longitudinal. As we anticipated in Section 16.11, a fluid cannot support transverse 
waves. 
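
The relation (16.142) between velocity and pressure can be checked numerically. The Python sketch below takes a sinusoidal pressure wave traveling along the x axis in water. The amplitude, wave number, and sample point are arbitrary illustrative choices, not from the text. It confirms by central differences that the velocity p′/(cϱ₀), directed along the propagation direction, satisfies the equation of motion (16.135).

```python
import math

RHO0 = 1000.0                  # equilibrium density of water, kg/m^3
BM = 2.2e9                     # bulk modulus of water, Pa
C = math.sqrt(BM / RHO0)       # wave speed from (16.140)

def p_prime(x, t, amp=10.0, k=2.0):
    """Pressure increment of a wave traveling in the +x direction (n = x-hat)."""
    return amp * math.sin(k * (x - C * t))

def v_x(x, t):
    """Fluid velocity along n predicted by (16.142): v = p' / (c rho_0)."""
    return p_prime(x, t) / (C * RHO0)

def motion_residual(x, t, hx=1e-4, ht=1e-8):
    """rho_0 dv/dt + dp'/dx by central differences; (16.135) says this is zero."""
    dv_dt = (v_x(x, t + ht) - v_x(x, t - ht)) / (2 * ht)
    dp_dx = (p_prime(x + hx, t) - p_prime(x - hx, t)) / (2 * hx)
    return RHO0 * dv_dt + dp_dx

print(motion_residual(0.3, 1.0e-3))  # tiny compared with the individual terms (about 14 Pa/m)
```

Because the only velocity component is the one along n, and it is a fixed multiple of p′, the sketch also illustrates that the velocity oscillates in phase with the pressure.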


25 In the third expression, I have omitted a constant of integration. This “constant” could depend 
on r (though certainly not on t) and could, in fact, be nonzero. For example, we could imagine a 
disturbance that included a constant uniform velocity superposed on the oscillatory wave motion. 
However, such a time-independent velocity is not part of what we would describe as the wave, so 
we can take it to be zero in our discussion of the wave motion. 
Principal Definitions and Equations of Chapter 16 

The One-Dimensional Wave Equation 

The one-dimensional wave equation is 

$$ \frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}. $$ [Eq. (16.4)] 

The general solution is 

$$ u(x, t) = f(x - ct) + g(x + ct) $$ [Eq. (16.10)] 

the first term of which represents a disturbance traveling to the right and the second 
one traveling to the left. 

The Three-Dimensional Wave Equation 

$$ \frac{\partial^2 p}{\partial t^2} = c^2\,\nabla^2 p. $$ [Eq. (16.39)] 

Stress, Strain, and the Elastic Moduli 

$$ \text{stress} = \frac{\text{force}}{\text{area}}, \qquad \text{strain} = \text{fractional deformation}. $$ [Eqs. (16.58) & (16.59)] 

$$ \text{elastic modulus (YM, BM, SM)} = \frac{\text{stress}}{\text{corresponding strain}}. $$ [Eq. (16.60)] 


The Stress Tensor 

The stress tensor is a 3 × 3 symmetric matrix Σ defined so that the surface force on 
a small element of area dA is 

$$ \mathbf{F}(d\mathbf{A}) = \boldsymbol{\Sigma}\,d\mathbf{A}. $$ [Eq. (16.66)] 


The Strain Tensor for a Solid 

If u(r) denotes the displacement of an element of solid from its original position r, 
the strain tensor is the 3 × 3 symmetric matrix E with elements 

$$ \epsilon_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial r_j} + \frac{\partial u_j}{\partial r_i}\right) $$ [Eq. (16.81)] 

which can be decomposed as the sum of two terms: 

$$ \mathbf{E} = e\mathbf{1} + \mathbf{E}' $$ [Eq. (16.88)] 

where e = ⅓ tr E. 

Generalized Hooke’s Law 

For an isotropic solid, the most general relation between the stress and strain tensors is 

$$ \boldsymbol{\Sigma} = \alpha e\mathbf{1} + \beta\mathbf{E}' $$ [Eq. (16.93)] 

where the constants α and β can be related to the three elastic moduli. 

[Eqs. (16.97), (16.99) & (16.100)] 


Equation of Motion for an Elastic Solid 

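The equation of motion for an isotropic elastic solid is the Navier equation: 

$$ \varrho\,\frac{\partial^2\mathbf{u}}{\partial t^2} = \varrho\,\mathbf{g} + \left(\mathrm{BM} + \tfrac{1}{3}\mathrm{SM}\right)\nabla(\nabla\cdot\mathbf{u}) + \mathrm{SM}\,\nabla^2\mathbf{u}. $$ [Eq. (16.117)] 


Waves in a Solid 

The speeds of longitudinal (or primary, or P) waves and of transverse (or shear, 
or secondary or S) waves are 

$$ c_{\mathrm{long}} = \sqrt{\frac{\mathrm{BM} + \tfrac{4}{3}\mathrm{SM}}{\varrho}} \qquad\text{and}\qquad c_{\mathrm{tran}} = \sqrt{\frac{\mathrm{SM}}{\varrho}}. $$ [Eqs. (16.119) & (16.120)] 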


The Material Derivative in a Fluid 

The time rate of change of a parameter ϱ (the density, temperature, or velocity) as we 
follow a material element of a moving fluid is the material derivative 

$$ \frac{d\varrho}{dt} = \frac{\partial\varrho}{\partial t} + \mathbf{v}\cdot\nabla\varrho. $$ [Eq. (16.121)] 

Equation of Motion for an Inviscid Fluid 

$$ \varrho\,\frac{d\mathbf{v}}{dt} = \varrho\,\mathbf{g} - \nabla p. $$ [Eq. (16.124)] 


Equation of Continuity 

Conservation of mass implies the equation of continuity: 

$$ \frac{\partial\varrho}{\partial t} + \nabla\cdot(\varrho\mathbf{v}) = 0. $$ [Eq. (16.130)] 

Waves in a Fluid 

The speed of longitudinal waves in a fluid is 

$$ c = \sqrt{\frac{\mathrm{BM}}{\varrho_0}} $$ [Eq. (16.140)] 

but a fluid cannot support transverse waves. 


Problems for Chapter 16 

Stars indicate the approximate level of difficulty, from easiest (★) to most difficult (★★★). 

section 16.1 Motion of a Taut String 

16.1 ★ Verify that the quantity c = √(T/μ) that appears in the wave equation for a string does indeed 
have the units of a speed. 

16.2 ★★ The wave equation (16.4) is the equation of motion for a continuous string, as illustrated in 
Figure 16.1(a). You can obtain this equation as the limit as n → ∞ of the equations for the n discrete 
masses of Figure 16.1(b). You need to be careful with the limiting process. As n → ∞, the spacing b 
between the masses (see Figure 16.20) and the individual masses m must both go to zero in such a way 
that the linear mass density m/b approaches μ, the density of the continuous string. You can guarantee 
this by taking m = μb. Write down Newton’s second law for the position u_i of the ith mass and show 
that it goes over to the wave equation as b → 0. 

section 16.2 The Wave Equation 

16.3 ★ Let f(ξ) be an arbitrary (twice differentiable) function. Show by direct substitution that 
f(x − ct) is a solution of the wave equation (16.4). 

16.4 ★ Show that if we make the change of variables ξ = x − ct and η = x + ct, then, as in (16.7), 

$$ \frac{\partial^2 u}{\partial t^2} - c^2\,\frac{\partial^2 u}{\partial x^2} = -4c^2\,\frac{\partial}{\partial\xi}\frac{\partial u}{\partial\eta}. $$

16.5 ★ (a) Show that u = g(x + ct) is a solution of the wave equation (16.4) for any twice differentiable 
function g(ξ). (b) Argue clearly that this solution represents a disturbance that travels undistorted to 
the left. 



Figure 16.20 Problem 16.2 
16.6 ★ There is a small flaw in Example 16.1 (page 686). In Equation (16.14) I omitted a constant of 
integration, so the equation should really have read f(x) − g(x) = k. Show that this still gives the 
same final answer (16.15). 

16.7 ★★ [Computer] Make plots of the two triangular waves of Example 16.1 (page 686) at several 
closely spaced times and then animate them. Describe the motion. For the purposes of the plot you 
may as well take the speed c, the height of the triangle at time 0, and the half width of the base all equal 
to 1. Make your plots for lots of times ranging from t = −4 to 4. 

16.8 ★★ [Computer] Make plots similar to Figure 16.5 of the standing wave (16.18) for several equally 
spaced times from t = 0 to τ, the period. Take 2A = 1 and k = ω = 2π. Animate your pictures and 
describe the motion. 

section 16.3 Boundary Conditions; Waves on a Finite String 

16.9 ★★ The motion of a finite string, fixed at both ends, was determined by the wave equation (16.19) 
and the boundary conditions (16.20). We solved these by looking for a solution that was sinusoidal in 
time. A different, and rather more general, approach to problems of this kind is called separation of 
variables. In this approach, we seek solutions of (16.19) with the separated form u(x, t) = X(x)T(t), 
that is, solutions that are a simple product of one function of x and a second of t. [As usual, there’s 
nothing to stop us trying to find a solution of this form. In fact, there is a large class of problems 
(including this one) where this approach is known to produce solutions, and enough solutions to allow 
expansion of any solution.] (a) Substitute this form into (16.19) and show that you can rewrite the 
equation in the form T″(t)/T(t) = c² X″(x)/X(x). (b) Argue that this last equation requires that both 
sides of this equation are separately equal to the same constant (call it K). It can be shown that K 
has to be negative. 26 Use this to show that the function T(t) has to be sinusoidal, which establishes 
(16.21) and we’re back to the solution of Section 16.3. The method of separation of variables plays an 
important role in several areas, notably quantum mechanics and electromagnetism. 

16.10 ★★ Using the integral (16.33), show that the Fourier coefficients of the triangular wave of Figure 
16.7 are zero for n even and given by (16.34) for n odd. 

16.11 ★★ [Computer] Make plots similar to Figure 16.8 of the wave of Example 16.2 but from t = 0 
to τ, the period, and for more closely spaced times. Animate your pictures and describe the motion. 

16.12 ★★ Consider a semi-infinite string, fixed at the origin x = 0 and extending far out to the right. 
Let f(ξ) be a function that is localized around the origin, such as the function of Figure 16.4(a). 
(a) Describe the wave given by the function f(x + ct) for a large negative time t₀. (b) One way to 
solve for the subsequent motion of this wave on the semi-infinite string is called the method of images 
and is as follows: Consider the function u = f(x + ct) − f(−x + ct). (The second term here is called 
the “image.” Can you explain why?) Obviously this satisfies the wave equation for all x and t. Show 
that it coincides with the given wave of part (a) at the initial time t₀ and everywhere on the semi-infinite 
string. Show also that it obeys the boundary condition that u = 0 at x = 0. (c) It is a fact that there is 
a unique wave that obeys the wave equation and any given initial and boundary conditions. Therefore 
the wave of part (b) is the solution for all times (on our semi-infinite string). Describe the motion on 
the semi-infinite string for all times. 


26 Actually this isn’t hard to show. Look at the equation X″(x)/X(x) = K/c². You can show that if K > 0 
there are no solutions satisfying the boundary conditions that X(0) = X(L) = 0. 
16.13 ★★ In connection with Equation (16.31), I claimed that any function on the interval 0 < x < L 
can be expanded in a Fourier series containing just sine functions. This is at first sight very surprising 
since one is used to the claim that the general Fourier series requires sines and cosines. In this problem, 
you’ll prove this surprising claim. Let f(x) be any function defined for 0 < x < L. We can define a 
function f(x) for all x by setting it equal to the given function in the original interval and requiring 
that 


f(−x) = −f(x) and f(x + 2L) = f(x) (16.143) 


for all x. Prove that this defines a function which is (1) periodic with period 2L, (2) odd, and (3) the 
same as the original f(x) on the original interval. Write down the ordinary Fourier expansion for this 
new f(x) and show that the coefficients of the cosine terms are all zero. This establishes the possibility 
of expanding the original function in terms of sines alone. 27 Bearing in mind that the period of the 
new function is 2L, write down the standard formula (5.84) for the expansion coefficients and show 
that your answer agrees with (16.33). The Fourier sine series is especially convenient for discussing 
functions that are zero at the end points x = 0 and L. 

16.14 ★★★ [Computer] A taut string of length L = 1 is released from rest at t = 0, with initial position 

$$ u(x, 0) = \begin{cases} 2x & [0 \le x \le \tfrac{1}{2}] \\ 2(1 - x) & [\tfrac{1}{2} \le x \le 1]. \end{cases} \qquad (16.144) $$

Take the wave speed on the string to be c = 1. (a) Sketch this initial shape and find the coefficients B_n 
in its Fourier sine series (16.31). (b) Make plots of the sum of the first several terms for several closely 
spaced times between t = 0 and τ, the period. Animate your plots and describe the motion. 


section 16.4 The Three-Dimensional Wave Equation 

16.15 ★★ Let f(ξ) be any function with first two derivatives f′(ξ) and f″(ξ), and let n be an arbitrary 
fixed unit vector. (a) Show that ∇f(n·r − ct) = n f′(n·r − ct). (b) Hence show that f(n·r − ct) 
satisfies the three-dimensional wave equation (16.38). (c) Argue that f(n·r − ct) represents a signal 
that is constant in any plane perpendicular to n (at any fixed time t) and propagates rigidly with speed 
c in the direction of n. 

16.16 ★★ Let f(r) be any spherically symmetric function of the position r; that is, when expressed in 
spherical polar coordinates (r, θ, φ), it depends only on the radial distance r and is independent of 
θ and φ. (a) Starting from the definition (16.38) of ∇², prove that 




(b) Prove the same result using the formula inside the back cover for V 2 in spherical polar coordinates. 
(Obviously, this second proof is much simpler, but the hard work is hidden in the derivation of the 
formula for V 2 .) 


27 But note that it has the form B_n sin(nπx/L). The usual Fourier series has sines and cosines, but their 
argument is 2nπx/L. Thus the new Fourier sine series has, in a sense, twice as many terms to make up for having 
only sines. 
section 16.6 Stress and Strain: the Elastic Moduli 

16.17 ★★ In Section 16.1 we derived the wave equation for transverse waves in a taut string. Here 
you will examine the possibility of longitudinal waves in the same string. Suppose that an element of 
string whose equilibrium position is x is displaced a short distance in the x direction to x + u(x, t). 
(a) Consider a short piece of string of length l and use the definition (16.55) of Young’s modulus YM 
to show that the tension is F = A YM ∂u/∂x, where A is the cross sectional area of the string. [If the 
string is already in tension in its equilibrium position, this F is the additional tension, that is, F = 
(actual − equilibrium).] (b) Now consider the forces on a short section of string dx and show that u 
obeys the wave equation with wave speed c = √(YM/ϱ) where ϱ is the density (mass/volume) of the 
string. 


section 16.7 The Stress Tensor 


16.18 ★ Figure 16.15 is an end view of a triangular prism, whose three faces are labeled by the vectors 
dA₁, etc. (The magnitude of dA₁ is the area of the corresponding surface and the direction is normal to 
it. There are two more faces parallel to the plane of the paper, but these do not concern us.) The ends 
of the three faces form a closed triangle. Explain clearly why this implies that dA₁ + dA₂ + dA₃ = 0. 

16.19 ★ Let n₁ and n₂ be any two unit vectors and P a point in a continuous medium. F(n₁ dA) is the 
surface force on a small area dA at P with unit outward normal n₁, so n₂·F(n₁ dA) is the component of 
that force in the direction of n₂. Prove Cauchy’s reciprocal theorem that n₂·F(n₁ dA) = n₁·F(n₂ dA). 


16.20 ★★ It is found that the stress tensor at any point (x, y, z) in a certain continuous medium has the 
form (with an unspecified, convenient choice of units) 

$$ \boldsymbol{\Sigma} = \begin{bmatrix} xz & z^2 & 0 \\ z^2 & 0 & -y \\ 0 & -y & 0 \end{bmatrix}. \qquad (16.145) $$

Find the surface force on a small area dA of the surface x² + y² + 2z² = 4 at the point (1, 1, 1). 


16.21 ★★ At any given point P of a continuous medium, the surface forces are given by the stress tensor, 
which is a real symmetric matrix Σ. It is a well-known theorem of linear algebra (see the appendix) that 
any such matrix can be brought into diagonal form by a suitable rotation of the Cartesian coordinate 
axes. Use this to prove that at any point P there are three orthogonal directions (the principal stress 
axes at P) with the property that the surface force on any surface normal to one of these directions is 
exactly normal to the surface. 

16.22 ★★★ Show that if the stress tensor Σ is diagonal (all off-diagonal elements zero) with respect to 
any choice of orthogonal axes, then it is in fact a multiple of the unit matrix. This gives an alternative 
and elegant proof that if there are no shearing stresses (in any coordinate system) then the pressure 
forces are independent of direction. To do this problem, you need to know how the elements of a tensor 
transform as we rotate our coordinate axes, as described in Section 15.17. Assume that with respect to 
one set of axes Σ is diagonal but that not all three diagonal elements are equal. (For example, σ₁₁ ≠ σ₃₃.) 
It is not hard to come up with a rotation (that of Equation (15.36) will do) such that in the rotated 
system σ′₁₃ ≠ 0. 
section 16.8 The Strain Tensor for a Solid 

16.23 ★ An important tool in the development of the strain tensor was the decomposition (16.79) of a 
matrix M into its antisymmetric and symmetric parts. Prove that this decomposition is unique. [Hint: 
Show that if M = M_A + M_S where M_A and M_S are respectively antisymmetric and symmetric, then 
M_A = ½(M − M̃) and M_S = ½(M + M̃), where M̃ is the transpose of M.] 

16.24 ★ Write out the components of the displacement (16.77), u(r) = θ × r, for a small rotation θ 
and verify that the derivatives matrix is given by Equation (16.78). 

16.25 ★★★ At a certain point P (which you can choose to be your origin) in a continuous solid, the 
strain tensor is E. Assume for simplicity that whatever displacements have occurred left P fixed and 
the neighborhood of P unrotated. (a) Show that the x axis near P is stretched by a factor of (1 + ε₁₁). 
(b) Hence show that any small volume around P has changed by dV/V = tr E. This shows that any two 
strains that have the same trace dilate volumes by the same amount. In the decomposition E = e1 + E′ 
(16.88), the spherical part e1 changes volumes by the same amount as E itself, while the deviatoric 
part E′ doesn’t change volumes at all. 

section 16.9 Relation between Stress and Strain: Hooke’s Law 

16.26 ★ The table below gives the three elastic moduli for several materials. According to (16.100) 
Young’s modulus for any given material can be calculated if we know the bulk and shear moduli. 
Using the data for BM and SM, calculate YM for each of the materials and compare with the given 
values in the third column. (The densities will be needed for Problem 16.32.) 

Elastic Moduli (in GPa) and Densities (in g/cm³) 

Material      BM     SM     YM     ϱ 
Iron          90     40     100    7.8 
Steel         140    80     200    7.8 
Sandstone     17     6      16     1.9 
Perovskite    270    150    390    4.1 
Water         2.2    0      0      1.0 

16.27 ★★★ Consider a taut wire or rod lying along the x axis. To define Young’s modulus YM we apply a 
pure tension along the axis; that is, a stress with σ₁₁ > 0 and all other σ_ij = 0. (a) Use Equation (16.95) 
to write down the corresponding strain tensor E. (b) Argue from the definition (16.55) of Young’s 
modulus to show that YM = σ₁₁/ε₁₁. (c) Combine these two results to verify the expression (16.100) 
for YM, showing in particular that YM = 9 BM · SM/(3 BM + SM). 

16.28 ★★★ Consider again the wire or rod of Problem 16.27. In general, when one stretches the wire 
longitudinally it will contract in the transverse directions. The ratio of the transverse fractional contraction 
to the longitudinal fractional stretch is called Poisson’s ratio (after the French mathematician and 
student of Laplace and Lagrange, 1781-1840) and is denoted by the Greek letter “nu,” ν. (a) Show that 
ν = −ε₂₂/ε₁₁. (b) Use the method of Problem 16.27 to show that ν = (3BM − 2SM)/(6BM + 2SM). 
(c) Calculate Poisson’s ratio for the five materials listed in Problem 16.26. Comment on its value for 
materials with BM ≫ SM. 

16.29 ★★★ When we change our coordinate axes, the strain tensor changes in accordance with Equation 
(15.132), which we can rewrite as E′ = R E R̃, where R is the (3 × 3) orthogonal rotation matrix. Use 
the property (15.129) of orthogonal matrices to show that tr E′ = tr E; that is, the trace of any tensor 
is rotationally invariant. Use this result to show that the decomposition E = e1 + E′ is rotationally 
invariant, in the sense described below Equation (16.92). 

section 16.10 The Equation of Motion for an Elastic Solid 

16.30 ★ If δ_ji denotes the Kronecker delta symbol (16.115) and a is a vector with components a_j 
(j = 1, 2, 3), prove that Σ_j δ_ji a_j = a_i. In the same way, show that Σ_j δ_ji ∂_j = ∂_i, a result we used in 
proving the important identity (16.116). 

section 16.11 Longitudinal and Transverse Waves in a Solid 

16.31 * A seismograph records the signals arriving from a distant earthquake. If the S waves arrive 12 
minutes after the P waves, how far away was the earthquake? Use the speeds found in Example 16.8 
(page 722). 

16.32 * [Computer] Using appropriate software, calculate the speeds of longitudinal and transverse 
waves in the five materials listed in Problem 16.26. Arrange for the software to give you a nice readable 
table of values. 

section 16.12 Fluids: Description of the Motion 

16.33 ★ Write down the equation of motion (16.124) as applied to a static fluid. Assuming that g is 
uniform and ϱ is constant (independent of r), prove the well-known result from introductory physics 
that the pressure difference between two points r₁ and r₂ is just Δp = ϱgh, where h is the vertical 
difference in elevations of r₁ and r₂. 

16.34 ** Equations (16.129) and (16.130) are two different forms of the equation of continuity. Prove 
that they are equivalent. 

section 16.13 Waves in a Fluid 

16.35 * A crucial step in showing that the waves in a fluid are necessarily longitudinal was the integral
in (16.142). For an arbitrary function $f(\xi)$, with derivative $f'(\xi)$, prove that $\int f'(\mathbf{n}\cdot\mathbf{r} - ct)\,dt =
-f(\mathbf{n}\cdot\mathbf{r} - ct)/c$.

16.36 ** To find the speed of sound in air using the result (16.140) requires a little care. (Even the great 
Newton got this one wrong!) The trouble is to decide on the correct value of the bulk modulus of air. 
Because the vibrations are so rapid, there is no time for heat transfer and the air expands and contracts 
adiabatically, so that $pV^{\gamma} = \text{constant}$, where $\gamma$ is the so-called “ratio of specific heats,” $\gamma = 1.4$ for air.
(a) Show that the bulk modulus is BM $= \gamma p$. (b) Use the ideal gas law, $pV = nRT$, to show that the
density is $\varrho_0 = pM/RT$, where $M$ is the average molecular mass of air ($M \approx 29$ grams/mole). (c) Put
these results together to show that the speed of sound is $c = \sqrt{\gamma RT/M}$. Find the speed of sound at
0°C, and compare with the accepted value of 331 m/s. 

16.37 ** Show that the intensity $I$ of a sound wave is proportional to the square of the pressure
increment $p'$. To do this consider a small sliver of fluid normal to the direction of propagation with
area $dA$. Write down the rate at which this sliver does work on the fluid just in front of it, then divide
by $dA$ to show that the time-average intensity is $\langle I \rangle = \langle p'^2 \rangle / c\varrho_0$.

16.38 *** A crucial step in deriving the wave equation for waves in a fluid was the neglect of the first
term on the right of Equation (16.134). (a) Justify this by using (16.139) to rewrite the right side of
(16.125) as $\varrho'\mathbf{g} - \mathrm{BM}\,\nabla\varrho'/\varrho_0$. Argue that the ratio of the first to the second term is of order $g\varrho_0\lambda/\mathrm{BM}$,
where $\lambda$ is a typical distance over which $\varrho'$ varies. (A good choice for $\lambda$ would be the wavelength of
the proposed wave, of order a centimeter or at most a few meters.) Using the values for water (BM =
2 GPa, etc.) show that the first term is negligible. (You would also reach the same conclusion for air.)
(b) Show with a similar argument that the second term on the right of (16.136) is negligible.


Diagonalizing Real Symmetric Matrices


A.1 Diagonalizing a Single Matrix 


In Chapter 10 we met the moment of inertia tensor I for a rigid body. With respect to 
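an arbitrary set of orthogonal axes, $\mathbf{I}$ is a real symmetric $3 \times 3$ matrix which gives the
body’s angular momentum $\mathbf{L}$ in terms of its angular velocity $\boldsymbol{\omega}$ as $\mathbf{L} = \mathbf{I}\boldsymbol{\omega}$. We defined
a principal axis as an axis with the property that if $\boldsymbol{\omega}$ points along the axis, then $\mathbf{L}$ is
parallel to $\boldsymbol{\omega}$, that is

$$\mathbf{L} = \mathbf{I}\boldsymbol{\omega} = \lambda\boldsymbol{\omega}, \tag{A.1}$$

for some number $\lambda$. We saw that if a body has three orthogonal principal axes, then
with respect to these axes $\mathbf{I}$ has the diagonal form

$$\mathbf{I} = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} \tag{A.2}$$

where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the moments of inertia about the three principal axes.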
Conversely, if I has this diagonal form, then the axes with respect to which I was 
calculated were principal axes. For this reason, the process of finding the principal axes 
is often referred to as diagonalization of the inertia tensor. I claimed in Section 10.4 
that any rigid body, spinning about any origin O, does have three orthogonal principal 
axes. It is the main purpose of this appendix to prove that claim. 

The process of diagonalizing a matrix comes up over and over again in many 
different branches of physics. For example, if you read Chapter 16, you know that 
both the stress and strain tensors are given by real, symmetric matrices, and it is 
frequently convenient to find axes with respect to which one of these is diagonal. 
[For example, the axes with respect to which the stress tensor (at a given point P) 
is diagonal are called the principal axes of stress and have the tidy property that the 
stress along each of these axes is a pure stretch.] In quantum mechanics, probably
the most important thing to do with the operator that represents any given dynamical
variable is to diagonalize it.¹ To emphasize the generality of the process, I shall denote
the matrix that we wish to diagonalize by $\mathbf{A}$, and since we frequently have occasion to
diagonalize an $n \times n$ matrix, where $n$ is any integer, not necessarily 3, we’ll suppose
for now that $\mathbf{A}$ is an arbitrary $n \times n$ symmetric, real matrix. Nevertheless, the example
that you may want to keep in mind is that $\mathbf{A} = \mathbf{I}$, the $3 \times 3$ moment of inertia tensor
of a rigid body. In the context of classical mechanics, the matrices that we wish to
diagonalize are almost always tensors (moment of inertia tensor, stress tensor, strain
tensor, etc.), and in this section I shall assume that $\mathbf{A}$ is the matrix representing an
$n$-dimensional tensor.

Before we prove our main result, let us pause to consider the effect of changing 
the axes with respect to which the tensor of interest is evaluated. In general, if the 
$n \times n$ matrix $\mathbf{A}$ represents an arbitrary $n$-dimensional tensor (with respect to a given
set of axes), then the matrix $\mathbf{A}'$ that represents the same tensor with respect to a
different set of axes is given by the orthogonal transformation $\mathbf{A}' = \mathbf{R}\mathbf{A}\tilde{\mathbf{R}}$, where
$\mathbf{R}$ is the orthogonal rotation matrix that relates the two sets of axes, as discussed
in connection with Equation (15.132). Fortunately, if you haven’t yet studied the
orthogonal transformations of tensors, you can still follow our main proof, if you
are content to consider just the case that the matrix $\mathbf{A}$ is the moment of inertia tensor
$\mathbf{A} = \mathbf{I}$. This matrix was defined by the sums (10.37) and (10.38) (or the corresponding
integrals, for a continuous body). If we change our axes, the set of coordinates $x, y, z$
is replaced by a different set $x', y', z'$, and using these new coordinates naturally leads
to a different $3 \times 3$ matrix $\mathbf{I}'$, and this is all you need to know about the relation of the
two matrices $\mathbf{I}$ and $\mathbf{I}'$.
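
The effect of a change of axes can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` and the angle `theta` are illustrative values chosen here, not taken from the text. It forms $\mathbf{A}' = \mathbf{R}\mathbf{A}\tilde{\mathbf{R}}$ for a rotation about the z axis and confirms that the new matrix is still symmetric and has the same trace and the same eigenvalues as the old one.

```python
import numpy as np

# Illustrative real symmetric matrix (an assumption for this sketch, arbitrary units).
A = np.array([[ 8.0, -3.0, -3.0],
              [-3.0,  8.0, -3.0],
              [-3.0, -3.0,  8.0]])

# Rotation of the axes through an arbitrarily chosen angle theta about the z axis.
theta = 0.3
c, s = np.cos(theta), np.sin(theta)
R = np.array([[  c,   s, 0.0],
              [ -s,   c, 0.0],
              [0.0, 0.0, 1.0]])

# The same tensor with respect to the rotated axes: A' = R A R~ (R~ is the transpose).
A_prime = R @ A @ R.T

is_orthogonal = np.allclose(R @ R.T, np.eye(3))
still_symmetric = np.allclose(A_prime, A_prime.T)
same_trace = np.isclose(np.trace(A_prime), np.trace(A))
same_eigenvalues = np.allclose(np.linalg.eigvalsh(A_prime), np.linalg.eigvalsh(A))
print(is_orthogonal, still_symmetric, same_trace, same_eigenvalues)
```

The individual elements of `A_prime` differ from those of `A`, but the quantities that characterize the tensor itself are unchanged.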

We are now ready to prove the following important theorem: 


Diagonalization of a Real Symmetric Tensor 
If A is a real symmetric n x n matrix representing an n-dimensional tensor, then 
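there exist $n$ orthogonal unit vectors $\mathbf{e}_1, \dots, \mathbf{e}_n$ with the properties that (1) each
$\mathbf{e}_i$ is an eigenvector of $\mathbf{A}$, that is

$$\mathbf{A}\mathbf{e}_i = \lambda_i\mathbf{e}_i \tag{A.3}$$

for some real eigenvalue $\lambda_i$, and (2) with respect to the axes defined by these $n$
unit vectors, the tensor is represented by the diagonal matrix

$$\mathbf{A}' = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \tag{A.4}$$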


1 In quantum mechanics, the dynamical variables are represented by complex, Hermitian matrices
(rather than real, symmetric matrices). However, the problem of diagonalizing them is very
similar in the two cases. Here I shall discuss just the case of real symmetric matrices.

Before we prove this, notice that since the $n$ unit vectors $\mathbf{e}_1, \dots, \mathbf{e}_n$ are mutually
orthogonal, they are certainly linearly independent. Therefore, any vector in the
$n$-dimensional space can be expanded in terms of them; that is, $\mathbf{e}_1, \dots, \mathbf{e}_n$ form an
orthonormal basis of the space in which they lie.
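
Before the proof, it may help to see the content of the theorem in a concrete case. The sketch below assumes NumPy and uses its routine `numpy.linalg.eigh`, which is designed for real symmetric (or Hermitian) matrices; the matrix `A` is an illustrative choice made here, with one repeated eigenvalue. It checks the two claims of the theorem: the unit vectors are orthonormal eigenvectors with real eigenvalues, and with respect to them the matrix is diagonal.

```python
import numpy as np

# Illustrative real symmetric matrix (an assumption for this sketch).
A = np.array([[ 8.0, -3.0, -3.0],
              [-3.0,  8.0, -3.0],
              [-3.0, -3.0,  8.0]])

# eigh returns the eigenvalues in ascending order and the unit
# eigenvectors as the COLUMNS of vecs.
lams, vecs = np.linalg.eigh(A)

# Take the eigenvectors e_1, ..., e_n as the ROWS of R, so that R plays
# the role of the orthogonal matrix relating old and new axes.
R = vecs.T

# Claim (1): each e_i satisfies A e_i = lambda_i e_i, and the e_i are orthonormal.
eigen_ok = all(np.allclose(A @ R[i], lams[i] * R[i]) for i in range(3))
orthonormal = np.allclose(R @ R.T, np.eye(3))

# Claim (2): with respect to the new axes the matrix is diagonal.
A_prime = R @ A @ R.T
diagonal = np.allclose(A_prime, np.diag(lams))

print(lams)
print(eigen_ok, orthonormal, diagonal)
```

For this particular matrix two of the eigenvalues coincide, which illustrates that the theorem still delivers a full orthonormal set when eigenvalues are repeated.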

The proof of this result proceeds in several steps: 

Step 1. $\mathbf{A}$ has at least one eigenvalue and corresponding eigenvector. We
have seen repeatedly that the eigenvalue equation (A.3) requires that
$\det(\mathbf{A} - \lambda\mathbf{1}) = 0$. This determinant is an $n$th degree polynomial in $\lambda$ so it
certainly has at least one zero, $\lambda = \lambda_1$, say.² Now, it is a well-known result of
linear algebra³ that, if $\det(\mathbf{B}) = 0$, then there exists at least one nonzero vector
$\mathbf{a}$ such that $\mathbf{B}\mathbf{a} = 0$. Therefore, since $\det(\mathbf{A} - \lambda_1\mathbf{1}) = 0$, there exists at least
one eigenvector $\mathbf{a}$ such that

$$\mathbf{A}\mathbf{a} = \lambda_1\mathbf{a}. \tag{A.5}$$


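Step 2. The eigenvalue $\lambda_1$ is real. Nothing we have said so far guarantees that
the eigenvalue $\lambda_1$ and eigenvector $\mathbf{a}$ are real. To show that $\lambda_1$ is, consider the
following: If we multiply (A.5) on the left by the row $\tilde{\mathbf{a}}^*$ (that is, the complex
conjugate of the transpose of the column $\mathbf{a}$), we find that

$$\lambda_1 = \frac{\tilde{\mathbf{a}}^*\mathbf{A}\mathbf{a}}{\tilde{\mathbf{a}}^*\mathbf{a}}. \tag{A.6}$$

Now, it is easy to see that both terms in this fraction are real: First,

$$\tilde{\mathbf{a}}^*\mathbf{a} = \sum_i a_i^* a_i = \sum_i |a_i|^2 > 0.$$

Meanwhile the numerator is

$$\tilde{\mathbf{a}}^*\mathbf{A}\mathbf{a} = (\tilde{\mathbf{a}}^*\mathbf{A}\mathbf{a})^{\sim} = \tilde{\mathbf{a}}\mathbf{A}\mathbf{a}^* = (\tilde{\mathbf{a}}^*\mathbf{A}\mathbf{a})^*$$

which shows that $\tilde{\mathbf{a}}^*\mathbf{A}\mathbf{a}$ is real. [For the first equality, I used the fact that the
left side is a $1 \times 1$ matrix and hence equal to its transpose; for the second I
used the well-known result that $(\mathbf{M}\mathbf{N}\mathbf{P})^{\sim} = \tilde{\mathbf{P}}\tilde{\mathbf{N}}\tilde{\mathbf{M}}$ and that our given matrix $\mathbf{A}$ is
symmetric; in the last, I used the fact that $\mathbf{A}$ is real.] Since both numerator and
denominator in (A.6) are real (and the denominator nonzero), it follows that
the eigenvalue $\lambda_1$ is real. (Notice that this argument applies to any eigenvalue
of $\mathbf{A}$; thus, any eigenvalue of a real symmetric matrix is real.)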

Step 3. The eigenvector can be taken to be real. One might be tempted to expect 
that the eigenvectors of a real matrix are necessarily real, but this is actually 


2 In general, an nth degree polynomial has n zeroes, but some (or even all) of these can be equal. 
Nevertheless, we’re certainly safe in claiming that there is at least one zero. 

3 See, for example, Mathematical Methods for Scientists and Engineers by Donald A. McQuarrie 
(University Science Books, 2003), page 434, or Mathematical Methods in Physical Sciences by Mary 
Boas (Wiley, 1983), page 133.

false, for, if a real vector $\mathbf{a}$ satisfies (A.5), then so too does $i\mathbf{a}$, which is certainly
not real. Thus, an eigenvector of $\mathbf{A}$ can in general be complex. However, taking
the complex conjugate of (A.5) and remembering that $\mathbf{A}$ and $\lambda_1$ are real, we
see that, if $\mathbf{a}$ is an eigenvector, so is $\mathbf{a}^*$. This in turn means that both of the
vectors $\mathbf{a} + \mathbf{a}^*$ and $i(\mathbf{a} - \mathbf{a}^*)$ are likewise solutions of (A.5). Since both of these are real and at
least one is nonzero, we have shown that for every eigenvalue there is at least
one real eigenvector. Therefore, we can, without loss of generality, assume
that the eigenvector $\mathbf{a}$ is real.

Step 4. Choose a new basis including the eigenvector. Our next step is to
normalize the real eigenvector $\mathbf{a}$ and to choose a new orthonormal basis with
this normalized eigenvector as its first unit vector. That is, we define the unit
vector

$$\mathbf{e}_1 = \frac{\mathbf{a}}{|\mathbf{a}|} \tag{A.7}$$

which also satisfies the eigenvalue equation (A.5),

$$\mathbf{A}\mathbf{e}_1 = \lambda_1\mathbf{e}_1, \tag{A.8}$$

and then choose $n - 1$ more unit vectors orthogonal to $\mathbf{e}_1$ and to each other to
define a new set of orthogonal axes.⁴ With respect to this new basis, the vector
$\mathbf{e}_1$ is represented by a column whose first entry is a 1 and all of whose other
entries are zero. The eigenvalue equation (A.8) implies that (with respect to
our new basis) the first column of the matrix representing $\mathbf{A}$ has $\lambda_1$ for its first
entry and zeroes for all the rest. Since the matrix is symmetric, it follows that
it has the form

$$(\text{new matrix } \mathbf{A} \text{ with respect to new basis}) = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & \mathbf{A}_1 & \\ 0 & & & \end{bmatrix} \tag{A.9}$$

where $\mathbf{A}_1$ is an $(n - 1) \times (n - 1)$ real symmetric matrix.

Step 5. Repeat steps 1 through 4 on the matrix $\mathbf{A}_1$. The matrix $\mathbf{A}_1$ can be viewed
as acting on the $(n - 1)$-dimensional subspace orthogonal to our first new basis
vector $\mathbf{e}_1$. It has at least one real eigenvalue $\lambda_2$ and a corresponding eigenvector,
which we can take to be real and normalize to give our second unit vector $\mathbf{e}_2$.
We now choose an orthonormal basis comprising $\mathbf{e}_1$, $\mathbf{e}_2$, and $n - 2$ other unit


4 In three dimensions, it is easy to see that this is always possible. Given $\mathbf{e}_1$ we just choose
any two unit vectors in the plane perpendicular to $\mathbf{e}_1$. In $n$ dimensions the argument is essentially
the same; to make it watertight, one can use the Gram-Schmidt orthogonalization procedure. See 
Mathematical Methods for Scientists and Engineers by Donald A. McQuarrie (University Science 
Books, 2003), page 448.

vectors, and with respect to this second new basis, the matrix representing our
tensor has the form

$$(\text{new matrix } \mathbf{A} \text{ with respect to second new basis}) = \begin{bmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ 0 & 0 & & & \\ \vdots & \vdots & & \mathbf{A}_2 & \\ 0 & 0 & & & \end{bmatrix} \tag{A.10}$$

where $\mathbf{A}_2$ is an $(n - 2) \times (n - 2)$ real symmetric matrix.

Step 6. Repeat steps 1 through 4 on $\mathbf{A}_2$, then $\mathbf{A}_3$, etc. After $n - 3$ further
repetitions, the matrix representing our tensor will have the diagonal form
claimed in (A.4), and this completes our proof. 
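
The repeated "peel off one eigenvector and reduce the dimension" structure of Steps 4 through 6 can be sketched in code. This is a minimal illustration under stated assumptions, not an efficient algorithm: it assumes NumPy, the function name `diagonalize_by_deflation` is hypothetical, Steps 1 through 3 (finding one real eigenvalue and real eigenvector) are simply delegated to `numpy.linalg.eigh` as a stand-in, and the completion of $\mathbf{e}_1$ to an orthonormal basis in Step 4 is done with a QR factorization rather than by hand.

```python
import numpy as np

def diagonalize_by_deflation(A):
    """Return (lams, E) where the columns of E are orthonormal eigenvectors
    e_1, ..., e_n of the real symmetric matrix A, found one at a time."""
    n = A.shape[0]
    if n == 1:
        return np.array([A[0, 0]]), np.eye(1)

    # Steps 1-3 (stand-in): one real eigenvalue and a real eigenvector.
    w, v = np.linalg.eigh(A)
    lam1, a = w[0], v[:, 0]

    # Step 4: normalize, then complete e_1 to an orthonormal basis.
    # QR of [e_1 | identity] gives an orthogonal Q whose first column is +-e_1.
    e1 = a / np.linalg.norm(a)
    Q, _ = np.linalg.qr(np.column_stack([e1, np.eye(n)]))

    # In the new basis the matrix has the block form (A.9).
    A_new = Q.T @ A @ Q
    A1 = A_new[1:, 1:]
    A1 = 0.5 * (A1 + A1.T)          # remove round-off asymmetry

    # Steps 5-6: repeat on the (n-1) x (n-1) block A_1.
    lams_rest, E_rest = diagonalize_by_deflation(A1)

    # Embed the lower-dimensional basis change and combine it with Q.
    B = np.eye(n)
    B[1:, 1:] = E_rest
    return np.concatenate([[lam1], lams_rest]), Q @ B

# Illustrative 4 x 4 real symmetric matrix (values chosen for this sketch).
A = np.array([[4.0, 1.0, 0.0, 2.0],
              [1.0, 3.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0],
              [2.0, 0.0, 1.0, 5.0]])
lams, E = diagonalize_by_deflation(A)
A_diag = E.T @ A @ E
print(np.round(A_diag, 10))
```

Each level of the recursion handles one eigenvector, so an $n \times n$ matrix is reduced through blocks of size $n - 1$, $n - 2$, and so on, exactly as in the proof.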


A.2 Simultaneous Diagonalization of Two Matrices 


In Chapter 11 we saw that a system with n degrees of freedom, oscillating about a 
position of stable equilibrium, obeys an equation of motion of the form 

$$\mathbf{M}\ddot{\mathbf{q}} = -\mathbf{K}\mathbf{q}, \tag{A.11}$$

where $\mathbf{q}$ is a column of $n$ generalized coordinates, and $\mathbf{M}$ and $\mathbf{K}$ are $n \times n$ real
symmetric matrices, called the mass and spring-constant matrices respectively. In
what follows it is important that both $\mathbf{M}$ and $\mathbf{K}$ are positive definite matrices. To see
what this means, consider first the matrix $\mathbf{K}$: According to (11.53), the potential energy
is $U = \tfrac{1}{2}\tilde{\mathbf{q}}\mathbf{K}\mathbf{q}$. This is zero at the equilibrium position $\mathbf{q} = 0$ and, since the equilibrium
is stable, $U$ must be greater than 0 for any $\mathbf{q} \neq 0$. Therefore, the matrix $\mathbf{K}$ must have the
property that $\tilde{\mathbf{q}}\mathbf{K}\mathbf{q} > 0$ for any $\mathbf{q}$ not equal to zero, which is the defining property of a positive
definite matrix. Similarly, according to (11.54), the kinetic energy is $T = \tfrac{1}{2}\dot{\tilde{\mathbf{q}}}\mathbf{M}\dot{\mathbf{q}}$, and
this must be positive for any $\dot{\mathbf{q}} \neq 0$; that is, $\mathbf{M}$ also must be positive definite.

We defined a normal mode as any motion in which all $n$ coordinates oscillate
sinusoidally at the same frequency $\omega$, so that $\mathbf{q}(t) = \mathrm{Re}(\mathbf{a}e^{i\omega t})$, and we saw that a
normal mode is possible if and only if $\omega$ and $\mathbf{a}$ satisfy

$$\mathbf{K}\mathbf{a} = \omega^2\mathbf{M}\mathbf{a}. \tag{A.12}$$

I claimed (and in some specific examples we saw explicitly) that there are $n$ independent
solutions $\mathbf{a}$ of this generalized eigenvalue equation and hence that any possible
motion can be expressed as a linear combination of the normal modes.

To prove this claim, notice that if we expand a solution of (A.11) in terms of the $n$
independent solutions $\mathbf{a}$ of (A.12), then the new generalized coordinates $q'_i$ satisfy⁵

$$\ddot{q}'_i = -\omega_i^2 q'_i. \tag{A.13}$$

Comparing (A.13) with (A.11), we see that we have to prove that there exists a basis
of the $n$-dimensional configuration space with respect to which the matrices $\mathbf{M}$ and
$\mathbf{K}$ are both diagonal, with the forms

$$\mathbf{M}' = \begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix} \quad\text{and}\quad \mathbf{K}' = \begin{bmatrix} \omega_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \omega_n^2 \end{bmatrix}. \tag{A.14}$$

In particular, in the new basis, the mass matrix is just the unit matrix. We shall prove 
this, although we shall see that in general the new basis is not orthonormal. The proof 
leans heavily on our previous proof and, like it, proceeds in several steps. 

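Step 1. Diagonalize $\mathbf{M}$. Since $\mathbf{M}$ is real and symmetric, we can find an $n \times n$
orthogonal matrix $\mathbf{R}$ which diagonalizes $\mathbf{M}$. That is, $\mathbf{M}' = \mathbf{R}\mathbf{M}\tilde{\mathbf{R}}$ is diagonal, with
the form

$$\mathbf{M}' = \mathbf{R}\mathbf{M}\tilde{\mathbf{R}} = \begin{bmatrix} \mu_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \mu_n \end{bmatrix}. \tag{A.15}$$

If we define $\mathbf{q}' = \mathbf{R}\mathbf{q}$ and $\mathbf{K}' = \mathbf{R}\mathbf{K}\tilde{\mathbf{R}}$, then, with respect to the new coordinates, the
eigenvalue equation (A.12) becomes $\mathbf{K}'\mathbf{a}' = \omega^2\mathbf{M}'\mathbf{a}'$.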

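Step 2. Rescale coordinates so that $\mathbf{M}'' = \mathbf{1}$. In terms of the new coordinates $\mathbf{q}'$, the
kinetic energy is

$$T = \tfrac{1}{2}\dot{\tilde{\mathbf{q}}}'\mathbf{M}'\dot{\mathbf{q}}' = \tfrac{1}{2}\sum_i \mu_i\,\dot{q}_i'^{\,2}. \tag{A.16}$$

Since this must be positive for any $\dot{\mathbf{q}}' \neq 0$, all of the numbers $\mu_i$ in (A.15) must be
positive. Therefore, we can scale each of the coordinates $q'_i$ up by a factor of $\sqrt{\mu_i}$.
Specifically, we’ll define new coordinates $q''_i = q'_i\sqrt{\mu_i}$. If, in addition, we define a
diagonal (though not orthogonal) matrix

$$\mathbf{S} = \begin{bmatrix} 1/\sqrt{\mu_1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1/\sqrt{\mu_n} \end{bmatrix} \tag{A.17}$$

and set

$$\mathbf{M}'' = \mathbf{S}\mathbf{M}'\tilde{\mathbf{S}} = \mathbf{1} \quad\text{and}\quad \mathbf{K}'' = \mathbf{S}\mathbf{K}'\tilde{\mathbf{S}}, \tag{A.18}$$

then, with respect to these new coordinates, the mass matrix is just the unit matrix, the
kinetic energy has the simple form $T = \tfrac{1}{2}\sum_i \dot{q}_i''^{\,2}$, and, most important, since $\mathbf{M}'' = \mathbf{1}$,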


5 Notice that the coordinates that I am here calling $q'_i$ are precisely the normal coordinates that
I introduced in Section 11.7 (where I called them $\xi_i$).

the generalized eigenvalue equation $\mathbf{K}\mathbf{a} = \omega^2\mathbf{M}\mathbf{a}$ has become the ordinary eigenvalue
equation

$$\mathbf{K}''\mathbf{a}'' = \omega^2\mathbf{a}''. \tag{A.19}$$

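Step 3. Diagonalize $\mathbf{K}''$. According to Section A.1, there is an orthogonal matrix $\mathbf{T}$
which diagonalizes $\mathbf{K}''$. That is, if we define

$$\mathbf{K}''' = \mathbf{T}\mathbf{K}''\tilde{\mathbf{T}} \quad\text{and}\quad \mathbf{M}''' = \mathbf{T}\mathbf{M}''\tilde{\mathbf{T}} = \mathbf{1}, \tag{A.20}$$

then both $\mathbf{K}'''$ and $\mathbf{M}'''$ are diagonal matrices, with $\mathbf{M}'''$ still equal to $\mathbf{1}$. This proves the
existence of the required $n$ eigenvectors with the advertised properties.⁶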
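
The three steps above translate directly into a short numerical procedure. The following is a minimal sketch, assuming NumPy; the matrices `M` and `K` are illustrative positive definite matrices chosen here, and the variable names are hypothetical. It carries out the rotation, the rescaling, and the second rotation, then checks that the combined transformation makes the mass matrix the unit matrix and the spring-constant matrix diagonal.

```python
import numpy as np

# Illustrative positive definite mass and spring-constant matrices
# (assumed values for this sketch, in arbitrary units).
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])
K = np.array([[ 3.0, -1.0],
              [-1.0,  2.0]])

# Step 1: orthogonal R (rows are eigenvectors of M) so that M' = R M R~ is diagonal.
mu, vecs = np.linalg.eigh(M)
R = vecs.T
M1 = R @ M @ R.T
K1 = R @ K @ R.T

# Step 2: rescale with S = diag(1/sqrt(mu_i)), so that M'' = S M' S~ = 1.
S = np.diag(1.0 / np.sqrt(mu))
M2 = S @ M1 @ S.T
K2 = S @ K1 @ S.T

# Step 3: orthogonal T that diagonalizes the real symmetric matrix K''.
omega2, tvecs = np.linalg.eigh(K2)
T = tvecs.T
K3 = T @ K2 @ T.T
M3 = T @ M2 @ T.T

# The combined (non-orthogonal) change of basis, and the mode vectors a
# as the columns of its transpose: K a_i = omega_i^2 M a_i.
C = T @ S @ R
a = C.T
print(omega2)
print(np.round(M3, 10))
print(np.round(K3, 10))
```

The matrix `C` is not orthogonal, because of the rescaling in Step 2; this is the numerical counterpart of the remark that the new basis is in general not orthonormal.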


6 One small point: Since the eigenvalues are supposed to be the squares of the normal frequencies, 
it is essential that they be positive. This is assured since $\mathbf{K}$ and hence $\mathbf{K}'''$ are positive definite.

• Keith Symon, Mechanics (3rd edition, Addison-Wesley, 1971)

• Stephen Thornton and Jerry Marion, Classical Dynamics of Particles and 
Systems (5th edition, Thomson, 2004) 


More advanced texts on classical mechanics 

• Louis Hand and Janet Finch, Analytical Mechanics (Cambridge University 
Press, 1998) — an undergraduate text, but distinctly more advanced than any 
of the above. 

• Herbert Goldstein, Charles Poole, and John Safko, Classical Mechanics (3rd 
edition, Addison-Wesley, 2002) — an astonishingly successful and enduring 
graduate text, first published in 1950. 


Books on mathematical methods 

• Mary Boas, Mathematical Methods in the Physical Sciences (2nd edition, 
Wiley, 1983) — an undergraduate text, beautifully written and comprehensive; 
still one of the best around. 

• Donald McQuarrie, Mathematical Methods for Scientists and Engineers 
(University Science Books, 2003): a new entry at the undergraduate level,
which has received glowing reviews; readable and very comprehensive, with
1161 pages.

• Jon Mathews and R. L. Walker, Mathematical Methods of Physics (2nd edition,
W. A. Benjamin, 1970) — a graduate text, but very accessible. 


Tables of integrals and other mathematical formulas 

• M. Abramowitz and I. Stegun, Handbook of Mathematical Functions (Dover, 
1965). 

• H. B. Dwight, Tables of Integrals and Other Mathematical Data (4th edition, 
MacMillan, 1961) 

• Alan Jeffrey, Handbook of Mathematical Formulas and Integrals (2nd edition, 
Academic Press, 2000) 


Books on Chaos 

• James Gleick, Chaos, Making a New Science (Viking-Penguin, 1987) — a 
highly readable, nontechnical history of chaos theory. 

• Gregory Baker and Jerry Gollub, Chaotic Dynamics: An Introduction (2nd 
edition, Cambridge University Press, 1996) — a pioneering undergraduate text 
on chaos, which naturally covers much more ground than here. 

• Steven H. Strogatz, Nonlinear Dynamics and Chaos (Addison-Wesley, 
1994) — a beautiful, mathematical account of many aspects of chaos theory. 


Books on relativity 

• Albert Einstein, Relativity (15th edition, Crown, 1961) — a readable account 
by the great man himself. 

• Wolfgang Rindler, Introduction to Special Relativity (Oxford University Press, 
1991) — a very nice account that goes a bit further than here, by one of the 
experts. 

• James Hartle, Gravity: An Introduction to Einstein’s General Relativity 
(Addison-Wesley, 2003) — an excellent account of the general theory written 
for undergraduates. 

• C. W. Misner, K. S. Thorne, and J. A. Wheeler, Gravitation (Freeman, 1970):
a classic, and still the most comprehensive text on general relativity. 

Books on continuum mechanics 

• Gerard Middleton and Peter Wilcock, Mechanics in the Earth and Environmental
Sciences (Cambridge University Press, 1994)

• D. S. Chandrasekharaiah and Lokenath Debnath, Continuum Mechanics 
(Academic Press, 1994) 

• Lawrence E. Malvern, Introduction to the Mechanics of a Continuous Medium 
(Prentice Hall, 1969) 



Answers for 
Odd-Numbered Problems 

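Chapter 1

1.1 $\mathbf{b} + \mathbf{c} = 2\hat{\mathbf{x}} + \hat{\mathbf{y}} + \hat{\mathbf{z}}$, $5\mathbf{b} + 2\mathbf{c} = 7\hat{\mathbf{x}} + 5\hat{\mathbf{y}} + 2\hat{\mathbf{z}}$, $\mathbf{b}\cdot\mathbf{c} = 1$, $\mathbf{b}\times\mathbf{c} = \hat{\mathbf{x}} - \hat{\mathbf{y}} - \hat{\mathbf{z}}$.

1.5 $\theta = \arccos\sqrt{2/3} = 0.615$ rad or 35.3°.

1.11 The particle moves counterclockwise around the ellipse $(x/b)^2 + (y/c)^2 = 1$ in the $xy$ plane,
making one complete orbit in a period $2\pi/\omega$.

1.23 $\mathbf{v} = (\lambda\mathbf{b} - \mathbf{b}\times\mathbf{c})/b^2$.

1.25 Any solution has the form $f(t) = Ae^{-3t}$, which contains one arbitrary constant.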

1.27 As seen from the ground the puck travels straight across the turntable passing through the 
center O. As seen by an observer sitting on the turntable, the puck follows a curving path as shown. 

[Figure: the puck's path as seen from the ground and as seen from the turntable.]

1.35 The position is $\mathbf{r} = (v_0 t\cos\theta,\ 0,\ v_0 t\sin\theta - \tfrac{1}{2}gt^2)$. The time to return to ground is
$t = (2v_0\sin\theta)/g$, and the distance traveled is $(2v_0^2\sin\theta\cos\theta)/g$.

1.37 (a) If we measure $x$ straight up the slope, $x = v_0 t - \tfrac{1}{2}gt^2\sin\theta$. (b) Time to return,
$t = 2v_0/(g\sin\theta)$.

1.39 $x = v_0 t\cos\theta - \tfrac{1}{2}gt^2\sin\phi$, $y = v_0 t\sin\theta - \tfrac{1}{2}gt^2\cos\phi$, $z = 0$.

1.41 Tension $= m\omega^2 R$ (or $mv^2/R$).

1.47 (a) $\rho = \sqrt{x^2 + y^2}$, $\phi = \arctan(y/x)$ (chosen to lie in the correct quadrant), and $z$ is the same
as in Cartesians. The coordinate $\rho$ is the perpendicular distance from $P$ to the $z$ axis. If we use $r$
for the coordinate $\rho$, then $r$ is not the same thing as $|\mathbf{r}|$ and $\hat{\mathbf{r}}$ is not the unit vector in the direction
of $\mathbf{r}$ [see part (b)].

(b) The unit vector $\hat{\boldsymbol{\rho}}$ points in the direction of increasing $\rho$ (with $\phi$ and $z$ fixed), that is, directly
away from the $z$ axis; $\hat{\boldsymbol{\phi}}$ is tangent to a horizontal circle through $P$ centered on the $z$ axis
(counterclockwise, seen from above); $\hat{\mathbf{z}}$ is parallel to the $z$ axis. $\mathbf{r} = \rho\hat{\boldsymbol{\rho}} + z\hat{\mathbf{z}}$.

(c) $a_\rho = \ddot{\rho} - \rho\dot{\phi}^2$, $a_\phi = \rho\ddot{\phi} + 2\dot{\rho}\dot{\phi}$, $a_z = \ddot{z}$.

1.49 $\phi = \phi_0 + \omega t$ and $z = z_0 + v_{0z}t - \tfrac{1}{2}gt^2$.


1.51 In the picture, the solid curve is a numerical solution of the differential equation (found with 
Mathematica’s NDSolve). The dashed curve is the small-oscillation approximation (1.57) with 
the same initial condition ($\phi_0 = \pi/2$). Considering how large the initial angle is, the small-angle
approximation does remarkably well. The only significant discrepancy is that the approximation 
oscillates somewhat too fast, as one would expect. (For large amplitudes, the true period is a little 
longer.) 



Chapter 2 


2.1 The two forces are about equal when $v \approx 1$ cm/s, and if $v \gg 1$ cm/s the linear force is
negligible. For a beachball, the corresponding speed is about 1 mm/s.


2.3 (b) $R \approx 0.01$, and it is very safe to neglect the quadratic drag.

2.7 If $F = F_0$, a constant, $v = v_0 + at$, where $a = F_0/m$.

2.11 (a) $v_y(t) = -v_{\rm ter} + (v_0 + v_{\rm ter})e^{-t/\tau}$ and $y(t) = -v_{\rm ter}t + (v_0 + v_{\rm ter})\tau(1 - e^{-t/\tau})$.

(b) $t_{\rm top} = \tau\ln(1 + v_0/v_{\rm ter})$ and $y_{\max} = [v_0 - v_{\rm ter}\ln(1 + v_0/v_{\rm ter})]\tau$.

2.13 $v = \omega\sqrt{x_0^2 - x^2}$, where $\omega = \sqrt{k/m}$, and $x(t) = x_0\cos\omega t$.

2.15 Time of flight, $t = 2v_{y0}/g$.

2.19 (a) $y = \dfrac{v_{y0}}{v_{x0}}x - \dfrac{1}{2}g\left(\dfrac{x}{v_{x0}}\right)^2$.

2.23 (a) Terminal speed = 22 m/s (ballbearing), (b) 140 m/s (steel shot), (c) 107 m/s (parachutist 
in free fall). 

2.27 Velocity, $v(t) = v_{\rm ter}\tan\left(\arctan\dfrac{v_0}{v_{\rm ter}} - \dfrac{c\,v_{\rm ter}}{m}t\right)$; (time to top) $= \dfrac{m}{c\,v_{\rm ter}}\arctan\dfrac{v_0}{v_{\rm ter}}$, where
$v_{\rm ter} = \sqrt{mg\sin(\theta)/c}$.


2.29
time (sec)               0     1      5     10     20     30
actual speed (m/s)       0    9.7   37.7   48.1   50.0   50.0
speed in vacuum (m/s)    0    9.8   49.0   98.0  196.0  294.0


2.31 (a) Terminal speed, $v_{\rm ter} = 20.2$ m/s. (b) Time to ground, $t = 2.78$ s (2.47 in vacuum), and
speed at ground, $v = 17.7$ m/s (24.2 in vacuum).


2.33 (a) [Figure: graphs of cosh(z) and sinh(z) for z between -3 and 3.]

(b) sinh(z)

(c)

2.35 (b) At $t = 2\tau$ and $3\tau$, the speed $v$ is 96% and 99.5% of its terminal value.

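2.39 (a) $t = \dfrac{m}{\sqrt{f_{\rm fr}c}}\left[\arctan\left(\sqrt{\dfrac{c}{f_{\rm fr}}}\,v_0\right) - \arctan\left(\sqrt{\dfrac{c}{f_{\rm fr}}}\,v\right)\right]$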

(b)
v (m/s)   15     10     5      0
t (s)     6.3    18.4   48.3   142

The corresponding times if we neglect friction are (from Problem 2.26) 6.7, 20.0, 60.0, and $\infty$.
To neglect friction, compared to the quadratic air resistance, is quite good at higher speeds, but
terrible at very low speeds.

2.41 The velocity is $v(y) = \sqrt{(v_0^2 + v_{\rm ter}^2)e^{-2gy/v_{\rm ter}^2} - v_{\rm ter}^2}$, where $v_{\rm ter} = \sqrt{mg/c}$;
$y_{\max} = 17.7$ m, compared with 20.4 in the vacuum.

2.43 (a) The solid curve is the true trajectory, the dashed is that in a vacuum.

[Figure: the two trajectories, with horizontal axis x (m) marked from 5 to 25.]

(b) The true range is 17.7 m and the range in vacuum 24.8 m. 

2.45 (b) $z = 3 + 4i = 5e^{0.927i}$; (c) $z = 2e^{-i\pi/3} = 1 - i\sqrt{3}$.

2.47 (a) $z + w = 9 + 4i$, $z - w = 3 + 12i$, $zw = 50$, $z/w = -0.56 + 1.92i$; (b) $z + w =
(4 + 2\sqrt{3}) + (4\sqrt{3} + 2)i$, $z - w = (4 - 2\sqrt{3}) + (4\sqrt{3} - 2)i$, $zw = 32i$, $z/w = \sqrt{3} + i$.

2.49 (b) $\cos 3\theta = \cos\theta\,(\cos^2\theta - 3\sin^2\theta)$ and $\sin 3\theta = \sin\theta\,(3\cos^2\theta - \sin^2\theta)$.

2.53 $m\dot{v}_x = qBv_y$, $m\dot{v}_y = -qBv_x$, $m\dot{v}_z = qE$.

The motion of $x$ and $y$ is the same as in Figure 2.15, clockwise motion around a circle at constant
angular velocity $\omega = qB/m$. Meanwhile, $z = z_0 + v_{z0}t + \tfrac{1}{2}a_z t^2$, where $a_z = qE/m$. The particle
moves in a helix or spiral of constant radius around the $z$ axis, with an increasing pitch as the
motion in the $z$ direction accelerates.

2.55 (a) $\dot{v}_x = \omega v_y$, $\dot{v}_y = -\omega(v_x - E/B)$, and $\dot{v}_z = 0$.

(b) $v_{\rm dr} = E/B$.

(c) $v_x = v_{\rm dr} + (v_{x0} - v_{\rm dr})\cos\omega t$, $v_y = -(v_{x0} - v_{\rm dr})\sin\omega t$, and $v_z = 0$.

The transverse velocity $(v_x, v_y)$ goes steadily around a circle of radius $(v_{x0} - v_{\rm dr})$, with a constant
drift $v_{\rm dr}$ in the $x$ direction superposed.

(d) $x = v_{\rm dr}t + R\sin\omega t$ and $y = R(\cos\omega t - 1)$

where $R = (v_{x0} - v_{\rm dr})/\omega$. This trajectory is a cycloid, whose precise appearance depends on the
initial velocity $v_{x0}$, as illustrated below for seven different values of $v_{x0}$. Notice, in particular, that
if $v_{x0} = v_{\rm dr}$ then $R = 0$ and the charge drifts straight through the fields, as we already knew. (The
values of $v_{x0}$ are shown as multiples of $v_{\rm dr}$.)




Chapter 3


3.3 The vectors $\mathbf{v}_2$ and $\mathbf{v}_3$ have equal magnitudes $v_2 = v_3 = \sqrt{2}\,v_0$, and are at 45° on either side
of the initial direction.



3.7 Final speed, $v \approx 2100$ m/s. Thrust $\approx 2.5 \times 10^7$ N, a little bigger than the initial weight
$\approx 2.0 \times 10^7$ N.

3.9 Minimum exhaust speed $\approx 2400$ m/s.

3.11 (b) and (c) $v = v_{\rm ex}\ln(m_0/m) - gt \approx 900$ m/s, compared to 2100 m/s in zero gravity.

3.13 Height $\approx 4.0 \times 10^4$ m.


3.15 CM position, $\mathbf{R} = (1/6, 0, 0)$.


3.17 The CM is about $4.6 \times 10^3$ km from the earth’s center.

3.19 (a) The CM would follow the same parabola as the unexploded shell. (b) The second piece
hits the gun that fired it. (c) No.

3.21 $X = Z = 0$ and $Y = 4R/3\pi$.

3.23 Velocity of second piece is $\mathbf{v} - \Delta\mathbf{v}$. The CM (empty circles) is at the midpoint of the line
joining the two fragments and clearly continues on the same parabola as the grenade followed
before the explosion.



3.25 Final angular velocity, $\omega = \omega_0(r_0/r)^2$.

3.29 Final angular velocity, $\omega = \omega_0(R_0/R)^5 = \omega_0/32$.

3.31 Moment of inertia, $I = \tfrac{1}{2}MR^2$.

3.33 Moment of inertia, $I = \tfrac{2}{3}Mb^2$.

3.35 (a) [Figure]

(b) and (c) Either way, $\dot{v} = \tfrac{2}{3}g\sin\gamma$.

3.37 (a) [Figure]


Chapter 4 


4.3 (a) $W = 0$, (b) $W = 1$, (c) $W = \pi/2$.


4.7 (a) $W(\mathbf{r}_1 \to \mathbf{r}_2) = -(m\gamma/3)(y_2^3 - y_1^3)$; $U(\mathbf{r}) = (m\gamma/3)y^3$.

(b)

(c) $v_{\rm fin} = \sqrt{2\gamma h^3/3}$.

4.9 (b) $x_0 = mg/k$.

4.11
function $f$                     $\partial f/\partial x$            $\partial f/\partial y$             $\partial f/\partial z$
$ay^2 + 2byz + cz^2$             $0$                                $2ay + 2bz$                         $2by + 2cz$
$\cos(axy^2z^3)$                 $-ay^2z^3\sin(axy^2z^3)$           $-2axyz^3\sin(axy^2z^3)$            $-3axy^2z^2\sin(axy^2z^3)$
$ar = a\sqrt{x^2 + y^2 + z^2}$   $ax/r$                             $ay/r$                              $az/r$


4.13
function     $\partial/\partial x$    $\partial/\partial y$    $\partial/\partial z$
$\ln(r)$     $x/r^2$                  $y/r^2$                  $z/r^2$
$r^n$        $nxr^{n-2}$              $nyr^{n-2}$              $nzr^{n-2}$
$g(r)$       $g'(r)x/r$               $g'(r)y/r$               $g'(r)z/r$


4.15 Using (4.35) we get $\Delta f \approx 0.44$, compared with the exact $\Delta f = 0.45$ (to two figures).

4.19 (a) The surface $x^2 + 4y^2 = K$ is an elliptical cylinder, centered on the $z$ axis, with “radius”
$\sqrt{K}$ in the $x$ direction and half that in the $y$ direction. (b) The unit normal to the surface is
$\mathbf{n} = (1, 4, 0)/\sqrt{17}$ (or $-\mathbf{n}$). The direction of maximum increase is $\mathbf{n}$ (and maximum decrease
is $-\mathbf{n}$).


4.21 The gravitational potential energy is U (r) = —GMm/r. 

4.23 (a) $\mathbf{F}$ is conservative and $U = -\tfrac{1}{2}k(x^2 + 2y^2 + 3z^2)$. (b) $\mathbf{F}$ is conservative and $U = -kxy$.
(c) $\mathbf{F}$ is not conservative.

4.29 (a) [Figure]

(b) The time to reach $A$ is $t(0 \to A) = \sqrt{m/2k}\int_0^A dx/\sqrt{A^4 - x^4}$. The period is $4t(0 \to A)$.
(d) $\tau = 3.71$.




(c) For b < r there can be two further equilibrium points (symmetrically placed on either side of 
0 = 0 ), both of which are unstable. 

4.35 (a) $E = \tfrac{1}{2}(m_1 + m_2 + I/R^2)\dot{x}^2 - (m_1 - m_2)gx$ (plus a constant that we may as well drop).
(b) Equation of motion is $(m_1 + m_2 + I/R^2)\ddot{x} = (m_1 - m_2)g$, either way.

4.37 (a) $U(\phi) = MgR(1 - \cos\phi) - mgR\phi$.

(b) There are equilibrium positions only if $m \leq M$. If $m = M$, there is one equilibrium (unstable)
at $\phi = 90°$. If $m < M$, there are two positions determined by the condition $m = M\sin\phi$, which
has two solutions symmetrically placed above and below $\phi = 90°$. The lower position is stable,
the upper unstable. [See the pictures of part (c)].



If $m = 0.7M$, the wheel swings up to a maximum $\phi < \pi$, then swings back to $\phi = 0$, and oscillates
indefinitely. If $m = 0.8M$, the wheel swings past $\phi = \pi$ and continues to rotate counterclockwise
until the string runs out. (d) The critical value of $m/M$ is 0.725.

4.39 (c) If $\Phi = 45°$, this approximation gives $\tau = 1.037\tau_0$, which represents a 3.7% correction to
the small-amplitude approximation ($\tau_0$), and is itself within 0.3% of the exact answer ($1.040\tau_0$).

$$\begin{aligned} U(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \mathbf{r}_4) = {} & [\,U_{12}(\mathbf{r}_1 - \mathbf{r}_2) + U_{13}(\mathbf{r}_1 - \mathbf{r}_3) + U_{14}(\mathbf{r}_1 - \mathbf{r}_4) + U_{23}(\mathbf{r}_2 - \mathbf{r}_3) \\ & + U_{24}(\mathbf{r}_2 - \mathbf{r}_4) + U_{34}(\mathbf{r}_3 - \mathbf{r}_4)\,] \\ & + [\,U_1^{\rm ext}(\mathbf{r}_1) + U_2^{\rm ext}(\mathbf{r}_2) + U_3^{\rm ext}(\mathbf{r}_3) + U_4^{\rm ext}(\mathbf{r}_4)\,]. \end{aligned}$$

4.53 (b) E = T x + T 2 + U x + U 2 + C 12 — {m2 V 2 

(c) Long before: $E = T_1 + T_2 + U_1 + 0 + 0 = T_2 - \dfrac{ke^2}{2r}$.

Long after: $E' = T_1' + T_2' + 0 + U_2' + 0 = T_1' - \dfrac{ke^2}{2r'}$.

By conservation of energy, $T_1' = T_2 + \tfrac{1}{2}ke^2\left(\dfrac{1}{r'} - \dfrac{1}{r}\right)$.


Chapter 5 


5.3 $U(\phi) = mgl(1 - \cos\phi)$ and $k = mgl$.

5.5 (a) $B_1 = C_1 + C_2$ and $B_2 = i(C_1 - C_2)$.

(b) $A = \sqrt{B_1^2 + B_2^2}$ and $\delta = \arctan(B_2/B_1)$, chosen in the right quadrant.

(c) $C = Ae^{-i\delta}$. (d) $C_1 = C/2$ and $C_2 = C^*/2$.

5.7 (a) $B_1 = x_0$ and $B_2 = v_0/\omega$. (b) $\omega = 10\ \mathrm{s}^{-1}$, $B_1 = 3$ m, $B_2 = 5$ m.

[Figure: graph of x(t) against t.]

(c) The first time $x = 0$ is $t = 0.26$ s; the first time $\dot{x} = 0$ is $t = 0.10$ s.

5.9 Period, $\tau = 1.05$ s.



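5.11 $A = \sqrt{\dfrac{x_2^2v_1^2 - x_1^2v_2^2}{v_1^2 - v_2^2}}$ and $\omega = \sqrt{\dfrac{v_1^2 - v_2^2}{x_2^2 - x_1^2}}$.

5.13 $r_0 = \lambda R$ and $\omega = \sqrt{\dfrac{2U_0}{m\lambda R^2}}$.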

5.17 (a) If the fraction $p/q$ is in its lowest terms, $\tau = 2\pi p/\omega_x$.

5.19 $k' = 2k(2a - l_0)/a$.

5.23 $dE/dt = \dot{x}(m\ddot{x} + kx)$.

5.25 (a) and (b) [Figure]

(c) With $\beta = \omega_0/2$, the amplitude shrinks by a factor of 0.026 in one period (much more than
the picture, for which $\beta$ was chosen to be $\omega_0/10$ and the shrinkage factor is about 0.53).

5.29 $\tau_1 = 1.006$ sec, and $\beta = 0.110\,\omega_0$.

5.31 Each picture shows $x(t)$ as a function of $t$ for the value of $\beta$ indicated.

[Figure: panels of x(t) against t, including those labeled β = 10 and β = 20.]

5.37 $A = 26.9$, $\delta = 3.04$ rad, $B_1 = 26.7$, and $B_2 = -6.18$.



The solid curve is the actual motion; the dashed curve is the transient, homogeneous solution. 
5.43 (a)l:«4x 10 4 N/m. (b) / ~ 6 Hz. (c) v & 5 m/s or roughly 10 mph. 
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5.49 a 0 = / m ax/2 and a n = 4/ max /(rc7r) 2 , for n odd but 0 for n even (n > 0); b n — 0 for all n. 



The left picture shows the sum of the first two terms (the constant term plus the first cosine) and 
the “sawtooth function” itself. The right picture shows the sum of the first six terms. This follows 
the sawtooth so closely that it is hard to tell them apart except at the comers. 

5.53 $A_0 = 1/(2\omega_0^2)$, $A_n = \dfrac{4}{n^2\pi^2\sqrt{(\omega_0^2 - n^2\omega^2)^2 + (2\beta n\omega)^2}}$ for $n$ odd but zero for $n$ even ($> 0$).
(a) With $\tau_0 = 2$, $\omega_0 = \pi$, and the first four coefficients $A_n$ ($n = 0, 1, 2, 3$) are 0.0507, 0.6450, 0,
and 0.0006.

(b) With $\tau_0 = 3$, $\omega_0 = 2\pi/3$, and the first four coefficients $A_n$ ($n = 0, 1, 2, 3$) are 0.1140, 0.0734,
0, and 0.0005.

[Figure: two panels, each with a horizontal axis running from 0 to 5.]

The left picture shows the data for this problem; the right shows the data of Figure 5.26 (though
drawn here to a slightly different scale). Notice that the resonances with $\beta = 0.1$ are twice as high
and half as wide as those for $\beta = 0.2$.
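
The coefficients quoted in 5.53 can be checked numerically. The sketch below evaluates $A_n$ from the formula above; the drive frequency $\omega = \pi$ (a drive period of 2) is an assumption inferred from the quoted numbers, and $\beta = 0.1$ is taken from the closing remark. The function name `amplitudes` is illustrative only.

```python
import math

def amplitudes(omega0, omega=math.pi, beta=0.1, n_max=3):
    """Return [A_0, ..., A_n_max] for the driven oscillator of answer 5.53.

    omega (drive frequency) and beta (damping constant) are assumed values.
    """
    coeffs = [1.0 / (2.0 * omega0**2)]           # A_0 = 1/(2 omega0^2)
    for n in range(1, n_max + 1):
        if n % 2 == 0:
            coeffs.append(0.0)                   # even harmonics vanish
            continue
        denom = math.sqrt((omega0**2 - (n * omega)**2)**2
                          + (2.0 * beta * n * omega)**2)
        coeffs.append(4.0 / (n**2 * math.pi**2 * denom))
    return coeffs

case_a = amplitudes(math.pi)            # part (a): omega0 = pi
case_b = amplitudes(2.0 * math.pi / 3)  # part (b): omega0 = 2 pi / 3
print([round(a, 4) for a in case_a])
print([round(a, 4) for a in case_b])
```

Under these assumptions the printed values agree with the rounded coefficients listed in parts (a) and (b).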


Chapter 6

6.3 Time to travel from $P_1$ via $Q$ to $P_2$ is $\left(\sqrt{x^2 + y_1^2 + z^2} + \sqrt{(x - x_2)^2 + y_2^2 + z^2}\right)/c$.
6.5 Time to travel from $A$ via $P$ to $B$ is $2\sqrt{2}\,(R/c)\cos(\theta/2)$.
6.7 $\phi = az + b$, where the constants $a$ and $b$ are chosen so the path passes through the given
endpoints. In general there are many different paths of this form.

6.9 y = sinh(x)/ sinh(l). 

6.11 The path is a parabola, $x = C + (y - D)^2/(4C)$, with $C$ and $D$ constant.

6.17 $\rho = \rho_0/\cos\left[(\phi - \phi_0)/\sqrt{1 + \lambda^2}\right]$. With $\lambda = 0$, the cone becomes a plane, and this equation
becomes the equation of a straight line.

6.23 (a) $v = \sqrt{(v_0\cos\phi + Vy)^2 + (v_0\sin\phi)^2}$. (c) $y_{\max} = 366$ miles; (time saved) = 27 minutes.

Chapter 7

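7.1 $\mathcal{L} = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - mgz$. The three Lagrange equations are $0 = m\ddot{x}$, $0 = m\ddot{y}$, and
$-mg = m\ddot{z}$.

7.3 $\mathcal{L} = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2) - \tfrac{1}{2}k(x^2 + y^2)$. The two Lagrange equations are $m\ddot{x} = -kx$ and $m\ddot{y} = -ky$.
This is the isotropic oscillator of Section 5.3.

7.5 $(\nabla f)_r = \dfrac{\partial f}{\partial r}$ and $(\nabla f)_\phi = \dfrac{1}{r}\dfrac{\partial f}{\partial \phi}$.

7.7 (a) $m_\alpha\ddot{\mathbf{r}}_\alpha = -\nabla_\alpha U$, [$\alpha = 1, \cdots, N$]. (b) $\mathcal{L} = \sum_\alpha \tfrac{1}{2}m_\alpha\dot{\mathbf{r}}_\alpha^2 - U(\mathbf{r}_1, \cdots, \mathbf{r}_N)$.

7.9 $x = R\cos\phi$, $y = R\sin\phi$, and $\phi = \arctan(y/x)$, chosen to lie in the correct quadrant.

7.11 $x = A\cos\omega t + l\sin\phi$, $y = l\cos\phi$, and $\phi = \arctan[(x - A\cos\omega t)/y]$.

7.15 $\mathcal{L} = \tfrac{1}{2}(m_1 + m_2)\dot{x}^2 + m_2gx$, $\ddot{x} = gm_2/(m_1 + m_2)$.

7.17 $\ddot{x} = g(m_1 - m_2)/(m_1 + m_2 + I/R^2)$.

7.21 $\mathcal{L} = \tfrac{1}{2}m(\dot{r}^2 + r^2\omega^2)$, and $r = Ae^{\omega t} + Be^{-\omega t}$.

7.23 $\mathcal{L} = \tfrac{1}{2}m(\dot{x} - A\omega\sin\omega t)^2 - \tfrac{1}{2}kx^2$.

7.27 (Acceleration of mass $4m$) = $g/7$ downward.

7.29 $\mathcal{L} = \tfrac{1}{2}m\left[R^2\omega^2 + l^2\dot\phi^2 + 2Rl\omega\dot\phi\sin(\phi - \omega t)\right] - mg(R\sin\omega t - l\cos\phi)$ and
$l\ddot\phi = -g\sin\phi + \omega^2R\cos(\phi - \omega t)$.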

7.31 (a) $\mathcal{L} = \tfrac{1}{2}(m + M)\dot{x}^2 + \tfrac{1}{2}M(L^2\dot\phi^2 + 2\dot{x}L\dot\phi\cos\phi) - \tfrac{1}{2}kx^2 + MgL\cos\phi$. The $x$ and $\phi$
equations are

$(m + M)\ddot{x} + ML(\ddot\phi\cos\phi - \dot\phi^2\sin\phi) = -kx$ and $M(L\ddot\phi + \ddot{x}\cos\phi) = -Mg\sin\phi$.

(b) With $x$ and $\phi$ both small, these become

$(m + M)\ddot{x} + ML\ddot\phi = -kx$ and $M(L\ddot\phi + \ddot{x}) = -Mg\phi$.

7.33 $x(t) = x_0\cosh\omega t + (g/2\omega^2)(\sin\omega t - \sinh\omega t)$.

7.35 $\mathcal{L} = \tfrac{1}{2}mR^2[\omega^2 + (\dot\phi + \omega)^2 + 2\omega(\dot\phi + \omega)\cos\phi]$. For small oscillations about $B$, the angular
frequency is $\omega$.
7.37 (a) $\mathcal{L} = m\dot{r}^2 + \tfrac{1}{2}mr^2\dot\phi^2 - mgr$. (b) The $r$ and $\phi$ equations are $mr\dot\phi^2 - mg = 2m\ddot{r}$ and
$mr^2\dot\phi = $ const. (c) $r_0 = [\ell^2/(m^2g)]^{1/3}$. (d) Angular frequency $= \sqrt{3/2}\;\ell/mr_0^2$.


7.39 (a) $\mathcal{L} = \tfrac{1}{2}m(\dot{r}^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2) - U(r)$.

(b) The $r$, $\theta$, and $\phi$ equations are

$m\ddot{r} = mr(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) - dU/dr$

$\dfrac{d}{dt}(mr^2\dot\theta) = mr^2\sin\theta\cos\theta\,\dot\phi^2$

$\dfrac{d}{dt}(mr^2\sin^2\theta\,\dot\phi) = 0.$

(c) The motion remains in the equatorial plane $\theta = \pi/2$, consistent with our knowledge that the
motion is confined to a plane.

(d) The motion remains in the longitudinal plane $\phi = \phi_0$.

7.41 $\mathcal{L} = \tfrac{1}{2}m(\dot\rho^2 + \rho^2\omega^2 + 4k^2\rho^2\dot\rho^2) - mgk\rho^2$, and the equation of motion is
$(1 + 4k^2\rho^2)\ddot\rho + 4k^2\rho\dot\rho^2 = (\omega^2 - 2gk)\rho$.

The bottom of the wire, $\rho = 0$, is an equilibrium, which is stable if $\omega^2 < 2gk$, but unstable if
$\omega^2 > 2gk$. If $\omega^2 = 2gk$, the bead is in equilibrium at any $\rho$, but the equilibrium is unstable (except
at $\rho = 0$).


7.43 (a) $\mathcal{L} = \tfrac{1}{2}(M + m)R^2\dot\phi^2 - MgR(1 - \cos\phi) + mgR\phi$, and the equation of motion is
$(M + m)R\ddot\phi = -Mg\sin\phi + mg$.

(b) [Figure: plot of $U(\phi)$.]

Notice that there are actually several equilibriums, separated by one or more complete revolutions.
7.49 (b) $\mathcal{L} = \tfrac{1}{2}m\dot{\mathbf{r}}^2 + q\dot{\mathbf{r}}\cdot\mathbf{A} = \tfrac{1}{2}m(\dot\rho^2 + \rho^2\dot\phi^2 + \dot{z}^2) + \tfrac{1}{2}qB\rho^2\dot\phi$, and the three Lagrange equations
are

$m\ddot\rho = m\rho\dot\phi^2 + qB\rho\dot\phi$, $\dfrac{d}{dt}(m\rho^2\dot\phi + \tfrac{1}{2}qB\rho^2) = 0$, and $m\ddot{z} = 0$.


7.51 $\mathcal{L}(x, y) = \tfrac{1}{2}m(\dot{x}^2 + \dot{y}^2) + mgy$. (a) The two modified Lagrange equations are

$\lambda\dfrac{x}{l} = m\ddot{x}$ and $mg + \lambda\dfrac{y}{l} = m\ddot{y}$.


Chapter 8 


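8.3 $y_1 = L + \tfrac{1}{2}v_0t - \tfrac{1}{2}gt^2 + \dfrac{v_0}{2\omega}\sin\omega t$ and $y_2 = \tfrac{1}{2}v_0t - \tfrac{1}{2}gt^2 - \dfrac{v_0}{2\omega}\sin\omega t$.

8.7 (a) Period $\tau = 2\pi r^{3/2}/\sqrt{Gm_2}$. (b) $\tau = 2\pi r^{3/2}/\sqrt{GM}$. These two answers are the same in
the limit that $m_2 \to \infty$. (c) $\tau = 0.71$ years.

8.9 (a) $\mathcal{L} = \tfrac{1}{2}M(\dot{X}^2 + \dot{Y}^2) + \tfrac{1}{2}\mu(\dot{r}^2 + r^2\dot\phi^2) - \tfrac{1}{2}k(r - L)^2$.

(b) $M\ddot{X} = 0$ and $M\ddot{Y} = 0$, with solutions $\mathbf{R} = \mathbf{R}_0 + \dot{\mathbf{R}}_0t$. (c) The $r$ and $\phi$ equations are
$\mu\ddot{r} = \mu r\dot\phi^2 - k(r - L)$ and $\mu r^2\dot\phi = $ const.

If $r = $ const, then $\dot\phi = $ const, and $r = L + \mu r\dot\phi^2/k$.

If $\phi = $ const, then $r = L + A\cos(\omega t - \delta)$, where $\omega = \sqrt{k/\mu} = \sqrt{2k/m_1}$.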


8.13 (a)

(b) $r_0 = (\ell^2/k\mu)^{1/4}$. (c) Angular frequency of oscillations, $\omega = 2\sqrt{k/\mu}$.

8.15 Percent variation ~ 0.1%.

8.19 Eccentricity, e = 0.17; (height when on y axis) = 1424 km.

8.21 (a) If $\ell \to 0$, then $a \to r_{\max}/2$. (b) $\tau(\ell \to 0) = (\pi/\sqrt{2GM})(r_{\max})^{3/2}$.

(c) $t = (\pi/2\sqrt{2GM})(r_{\max})^{3/2}$. (d) and (e) $\tau(\ell = 0) = (2\pi/\sqrt{2GM})(r_{\max})^{3/2} = 2\tau(\ell \to 0)$.


8.23 (b) $\beta = \sqrt{1 + m\lambda/\ell^2}$ and $c = \ell^2\beta^2/mk$. (c) The orbit is closed if $\beta$ is a rational number,
$\beta = p/q$ (where $p$ and $q$ are integers). If $\lambda \to 0$, the orbit becomes a Kepler ellipse.
8.25 (a) $r_0 = 1$.

[Figure: left, $U_{\text{eff}}(r)$ with vertical axis from $-0.2$ to $0.1$; right, the orbit.]

(b) You can see from the picture on the left that if $E = -0.1$, the inner turning point is at
about $r_{\min} = 0.7$. If we use this as a starting value for any equation-solving program (such
as Mathematica’s FindRoot), we find that this root of the equation $U_{\text{eff}}(r) = -0.1$ is actually
$r_{\min} = 0.6671$.
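
The root-finding step in 8.25(b) can be sketched with plain bisection in place of FindRoot. The effective potential used here, $U_{\text{eff}}(r) = -\tfrac{2}{3}r^{-3/2} + \tfrac{1}{2}r^{-2}$, is an assumption: it is not stated in this answer, and was chosen because it has its minimum at $r_0 = 1$ as in part (a). The helper names `u_eff` and `bisect` are illustrative.

```python
def u_eff(r):
    # Assumed effective potential (not given in the answer text itself).
    return -(2.0 / 3.0) * r**-1.5 + 0.5 * r**-2

def bisect(f, lo, hi, tol=1e-10):
    """Find a root of f in [lo, hi] by bisection; f must change sign there."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # sign change lies in the upper half
    return 0.5 * (lo + hi)

energy = -0.1
# Bracket the inner turning point between r = 0.5 and the minimum at r = 1.
r_min = bisect(lambda r: u_eff(r) - energy, 0.5, 1.0)
print(round(r_min, 4))
```

Bisection needs only a bracketing interval rather than a starting guess, which makes it a convenient cross-check on the value read off the picture.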

(c) Obviously the orbit shown on the right has not closed after 3.5 revolutions, and it clearly won’t 
close for a long time. (In fact, it never does, but this is harder to prove.) 

8.27 $c = 8.87 \times 10^7$ km, $\epsilon = 0.753$, $\delta = 1.72$ rad.

8.29 The new orbit would be a parabola, tangent to the old circular orbit at the point at which the 
great disappearance occurred. The earth would be just not bound. 




8.35 First thrust factor $= \sqrt{2/5}$; second $= \sqrt{5/8}$.

Chapter 9

9.1 (Angle of tilt) = arctan(A/g) with vertical.

9.3 (a) $F_{\text{tid}}/mg \approx 1.1 \times 10^{-7}$. (b) Same magnitude, opposite direction.
9.9 $F_{\text{cor}} = 2mv_0\Omega\cos\theta$ due east; $F_{\text{cor}}/mg \approx 0.011$.

9.13 The maximum value of $\alpha$ is about 0.1°; the minimum is zero.
9.15 $g = g_0\sqrt{\cos^2\theta + \lambda^2\sin^2\theta}$.

9.19 (a) As seen from the ground, the puck moves in a straight line. As seen from the merry-go-round,
its initial acceleration is radially outward; as it speeds up, it curves to its right and spirals
outward from the center. (b) As seen from the ground, it remains stationary. As seen from the
merry-go-round, it moves in a clockwise circle centered on the axis of the merry-go-round.


9.21 


This is what happens in Problem 9.24(d). 
9.25 Angle = 0.13° to left. 



[Figure: (a) view from earth; (b) view from space; horizontal axis $x$ (east).]


9.31 

9.33 


v = 0.11 mm/s. 




Chapter 10 


10.3 R = (0,0, H/5). 


10.5 $\mathbf{R} = (0, 0, 3R/8)$.

10.7 (a) $V = \tfrac{2}{3}\pi R^3(1 - \cos\theta_0)$; (b) $\mathbf{R} = (0, 0, Z)$ where $Z = \dfrac{3R}{16}\,\dfrac{1 - \cos 2\theta_0}{1 - \cos\theta_0}$.

10.9 $I = \tfrac{1}{2}MR^2$.

10.11 (a) $I(\text{solid}) = \tfrac{2}{5}MR^2$; (b) $I(\text{hollow}) = \dfrac{2}{5}M\,\dfrac{b^5 - a^5}{b^3 - a^3}$.


10.13 (a) $\omega = \sqrt{mga/I}$; (b) $l = I/(ma)$.

10.15 w = 

10.17 I zz = ~M(a 2 + b 2 ). 

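10.23 All products of inertia involving $z$ are automatically zero, $I_{xz} = I_{yz} = I_{zx} = I_{zy} = 0$.
10.25 (a) and (b) The inertia tensors $\mathbf{I}_{\text{cm}}$ about the CM and $\mathbf{I}_A$ about $A$ are

$\mathbf{I}_{\text{cm}} = \dfrac{1}{3}M\begin{bmatrix} b^2 + c^2 & 0 & 0 \\ 0 & c^2 + a^2 & 0 \\ 0 & 0 & a^2 + b^2 \end{bmatrix}$

and

$\mathbf{I}_A = \dfrac{1}{3}M\begin{bmatrix} 4(b^2 + c^2) & -3ab & -3ac \\ -3ba & 4(c^2 + a^2) & -3bc \\ -3ca & -3cb & 4(a^2 + b^2) \end{bmatrix}$.

(c) $\mathbf{L} = \tfrac{1}{3}M\omega\,(4(b^2 + c^2),\ -3ab,\ -3ac)$.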

10.27

$\mathbf{I} = \dfrac{1}{4}M\begin{bmatrix} R^2 + 2h^2 & 0 & 0 \\ 0 & R^2 + 2h^2 & 0 \\ 0 & 0 & 2R^2 \end{bmatrix}$.


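10.35 (a)

$\mathbf{I} = ma^2\begin{bmatrix} 10 & 0 & 0 \\ 0 & 6 & 1 \\ 0 & 1 & 6 \end{bmatrix}$.

(b) The principal moments are $\lambda_1 = 10ma^2$, $\lambda_2 = 7ma^2$, and $\lambda_3 = 5ma^2$. The corresponding
principal directions are $\mathbf{e}_1 = (1, 0, 0)$, $\mathbf{e}_2 = \tfrac{1}{\sqrt{2}}(0, 1, 1)$, and $\mathbf{e}_3 = \tfrac{1}{\sqrt{2}}(0, 1, -1)$.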
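
A quick way to confirm a set of principal moments and directions is to check the eigenvalue equation $\mathbf{I}\mathbf{e} = \lambda\mathbf{e}$ directly. The following is a minimal sketch for the tensor of 10.35, working in units of $ma^2$; the helper names `matvec` and `residual` are illustrative.

```python
import math

# Inertia tensor of answer 10.35(a), in units of m a^2.
I = [[10, 0, 0],
     [0, 6, 1],
     [0, 1, 6]]

def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

s = 1.0 / math.sqrt(2.0)
# (principal moment, principal direction) pairs from answer 10.35(b).
pairs = [
    (10, [1.0, 0.0, 0.0]),
    (7, [0.0, s, s]),
    (5, [0.0, s, -s]),
]

def residual(lam, e):
    """Largest component of I e - lam e; zero for a true eigenpair."""
    Ie = matvec(I, e)
    return max(abs(Ie[i] - lam * e[i]) for i in range(3))

residuals = [residual(lam, e) for lam, e in pairs]
print(residuals)
```

All three residuals vanish to rounding error, so each quoted direction is an eigenvector with the quoted moment.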

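10.37 (a)

$\mathbf{I} = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix}$.

(b) $\lambda_1 = 1$, $\lambda_2 = 3$, and $\lambda_3 = 4$; $\mathbf{e}_1 = \tfrac{1}{\sqrt{2}}(1, 1, 0)$, $\mathbf{e}_2 = \tfrac{1}{\sqrt{2}}(1, -1, 0)$, and $\mathbf{e}_3 = (0, 0, -1)$.

10.39 $\Omega \approx 21$ rad/s or about 200 rpm.

10.47 About 1010 years.

10.53 $b^2 \gg 4ac$.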

10.57 (a) $\mathcal{L} = \tfrac{1}{2}M(\dot{X}^2 + \dot{Y}^2) + \tfrac{1}{2}\lambda_1^{\text{cm}}(\dot\phi^2\sin^2\theta + \dot\theta^2) + \tfrac{1}{2}\lambda_3^{\text{cm}}(\dot\psi + \dot\phi\cos\theta)^2 - MgR\cos\theta$ where
$\lambda_1^{\text{cm}}$ and $\lambda_3^{\text{cm}}$ are the two principal moments about the CM. (c) The larger precession rate is bigger
about the CM than about the tip. According to (10.111) the smaller rate is unchanged. [But note
that (10.111) is an approximation; if we keep the next term in the approximation, we find that the
smaller rate is slightly reduced.]
Chapter 11


11.1 $k_1(l_1 - L_1) = k_2(l_2 - L_2) = k_3(l_3 - L_3)$.

11.3 $\omega^2 = \dfrac{1}{2m_1m_2}\Big[m_1(k_2 + k_3) + m_2(k_1 + k_2)$

$\pm \sqrt{m_1^2(k_2 + k_3)^2 + m_2^2(k_1 + k_2)^2 - 2m_1m_2(k_2k_3 + k_3k_1 + k_1k_2 - k_2^2)}\,\Big]$.

11.5 (a) Let $m_1 = m_2 = m$ and $k_1 = k_2 = k$, and $\omega_0 = \sqrt{k/m}$. Then the normal frequencies are

$\omega_1 = \omega_0\sqrt{\dfrac{3 - \sqrt{5}}{2}} = 0.62\,\omega_0$ and $\omega_2 = \omega_0\sqrt{\dfrac{3 + \sqrt{5}}{2}} = 1.62\,\omega_0$.

(b) In mode 1, the two carts oscillate in phase, with the amplitude of $m_2$ equal to 1.62 times that
of $m_1$. In mode 2, they oscillate exactly out of phase, with the amplitude of $m_2$ equal to 0.62 times
that of $m_1$.
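
The two frequencies in 11.5 follow from the general two-cart result of 11.3. The sketch below evaluates the roots of $\det(\mathbf{K} - \omega^2\mathbf{M}) = 0$ for two carts and three springs; setting $k_3 = 0$ (no third spring) for 11.5 is an assumption consistent with the quoted numbers, and the function name `normal_frequencies` is illustrative.

```python
import math

def normal_frequencies(m1, m2, k1, k2, k3):
    """Return (omega_1, omega_2) for two carts joined by three springs."""
    a = m1 * (k2 + k3) + m2 * (k1 + k2)
    disc = ((m1 * (k2 + k3))**2 + (m2 * (k1 + k2))**2
            - 2.0 * m1 * m2 * (k2 * k3 + k3 * k1 + k1 * k2 - k2**2))
    root = math.sqrt(disc)
    # omega^2 = [a -/+ root] / (2 m1 m2); take the square root of each.
    return tuple(math.sqrt((a + sign * root) / (2.0 * m1 * m2))
                 for sign in (-1.0, 1.0))

# Units with m = k = 1, so the results are in units of omega_0 = sqrt(k/m).
w1, w2 = normal_frequencies(1.0, 1.0, 1.0, 1.0, 0.0)
print(round(w1, 2), round(w2, 2))
```

With equal masses and $k_1 = k_2$, the expression under the square root reduces to $5$, which is where the $\sqrt{5}$ in the answer comes from.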


11.7 (b) $B_1 = A$ and $B_2 = C_1 = C_2 = 0$.

[Figure: panels (b) and (c).]

(c) $B_1 = B_2 = A/2$ and $C_1 = C_2 = 0$.

11.9 (b) $\xi_1 = A_1\cos(\omega_1t - \delta_1)$ and $\xi_2 = A_2\cos(\omega_2t - \delta_2)$, where $\omega_1 = \sqrt{k/m}$ and $\omega_2 = \sqrt{3k/m}$,
and $A_1$, $\delta_1$, $A_2$, $\delta_2$ are all arbitrary constants. Hence

$x_1 = A_1\cos(\omega_1t - \delta_1) + A_2\cos(\omega_2t - \delta_2)$
$x_2 = A_1\cos(\omega_1t - \delta_1) - A_2\cos(\omega_2t - \delta_2)$.

11.11 (a) $m\ddot{x}_1 = -2kx_1 + kx_2 - b\dot{x}_1 + F_0\cos\omega t$

$m\ddot{x}_2 = kx_1 - 2kx_2 - b\dot{x}_2$

(c) $\xi_1(t) = A_1\cos(\omega t - \delta_1) + B_1e^{-\beta t}\cos(\omega_1t - \delta_1^{\text{tr}})$

$\xi_2(t) = A_2\cos(\omega t - \delta_2) + B_2e^{-\beta t}\cos(\omega_2t - \delta_2^{\text{tr}})$

where the constants $A_1$, $A_2$, $\delta_1$, $\delta_2$ are given by Eqs. (5.64) and (5.65) (except that now $f_0 = F_0/2m$
and, in the case of $A_2$ and $\delta_2$, $\omega_0^2$ is replaced by $3\omega_0^2$); the constants $B_1$, $B_2$, $\delta_1^{\text{tr}}$, $\delta_2^{\text{tr}}$ in the
transient terms are arbitrary and determined by the initial conditions, and $\omega_1 = \sqrt{\omega_0^2 - \beta^2}$, while

$\omega_2 = \sqrt{3\omega_0^2 - \beta^2}$.

11.13 (b) Let $\beta = b/2m$, $\omega_1 = \sqrt{k/m - \beta^2}$, and $\omega_2 = \sqrt{(k + 2k_2)/m - \beta^2}$. Then

$\xi_1 = e^{-\beta t}(B_1\cos\omega_1t + C_1\sin\omega_1t)$ and $\xi_2 = e^{-\beta t}(B_2\cos\omega_2t + C_2\sin\omega_2t)$
where $B_1$, $C_1$, $B_2$, $C_2$ are arbitrary constants.
(c) With the given initial conditions (and $\beta \ll 1$)

$x_1 = \tfrac{1}{2}Ae^{-\beta t}(\cos\omega_1t + \cos\omega_2t)$ and $x_2 = \tfrac{1}{2}Ae^{-\beta t}(\cos\omega_1t - \cos\omega_2t)$.

[Figure: plots of $x_1$ and $x_2$.]

11.15 $\mathcal{L} = T - U$ where $T$ and $U$ are given in (11.38) and (11.37). The $\phi_1$ equation is

$(m_1 + m_2)L_1^2\ddot\phi_1 + m_2L_1L_2\ddot\phi_2\cos(\phi_1 - \phi_2) + m_2L_1L_2\dot\phi_2^2\sin(\phi_1 - \phi_2)$

$= -(m_1 + m_2)gL_1\sin\phi_1$

and the $\phi_2$ equation is

$m_2L_1L_2\ddot\phi_1\cos(\phi_1 - \phi_2) + m_2L_2^2\ddot\phi_2 - m_2L_1L_2\dot\phi_1^2\sin(\phi_1 - \phi_2)$

$= -m_2gL_2\sin\phi_2$.

11.17 (a) o> 2 = |co 2 and co 2 = |« 2 . 

For the first mode, a = A j; for the second, a = A ^ j. 

(b) j = "7 | [3 j COSWl * ~ \ j COSa)2t }’ * s not P er *odic. 

11.19 (a) $\mathcal{L} = \tfrac{1}{2}(m + M)\dot{x}^2 + ML\dot{x}\dot\phi + \tfrac{1}{2}ML^2\dot\phi^2 - (\tfrac{1}{2}kx^2 + \tfrac{1}{2}MgL\phi^2)$. The $x$ and $\phi$ equations
are

$(m + M)\ddot{x} + ML\ddot\phi = -kx$ and $ML\ddot{x} + ML^2\ddot\phi = -MgL\phi$.

(b) With the given numerical values, the normal frequencies are

$\omega_1 = \sqrt{2 - \sqrt{2}} = 0.77$ and $\omega_2 = \sqrt{2 + \sqrt{2}} = 1.85$.

In the first mode, the cart and bob oscillate in phase (both moving to the right and then both moving
to the left), with the bob’s amplitude (of motion relative to the cart) $\sqrt{2}$ times bigger than the cart’s.
In the second mode, the cart and bob oscillate exactly out of phase, again with the amplitude of
the bob equal to $\sqrt{2}$ times that of the cart.

11.23 $\omega_1^2 = \dfrac{g}{L} + \dfrac{k}{m}$ and $\omega_2^2 = \dfrac{g}{L} + \dfrac{3k}{m}$.

11.25 If $\omega_0 = \sqrt{k/m}$, then the normal frequencies are
In the first mode, all three carts oscillate in phase, with $a_1 = a_3 = a_2/\sqrt{2}$; in the second, the middle
cart is stationary, while the first and third oscillate exactly out of phase, with $a_1 = -a_3$ and $a_2 = 0$;
in the third, the left and right carts oscillate in phase, while the middle one is exactly out of phase,
with $a_1 = a_3 = -a_2/\sqrt{2}$.

11.27 (a) $\mathcal{L} = \tfrac{1}{2}m(\dot{x}_1^2 + \dot{x}_2^2) - \tfrac{1}{2}k(x_1 - x_2)^2$. The normal frequencies are $\omega_1 = 0$ and $\omega_2 = \omega_0\sqrt{2}$
(with $\omega_0 = \sqrt{k/m}$). (b) In the second mode, the two carts oscillate with equal amplitudes, but
exactly out of phase. (c) In the first mode, $x_1 = x_2 = x_0 + v_0t$; that is, they move at constant
velocity with the spring at its equilibrium length.

11.29 $\omega_1 = \sqrt{2k/m}$, $\omega_2 = \sqrt{6k/m}$, and $\omega_3 = \sqrt{g/r_0}$, where $r_0$ is the equilibrium value of $r$.
11.31 The three normal frequencies are 0, $\sqrt{2}\,\omega_0$, and $\sqrt{3}\,\omega_0$.

11.35 (a) $\xi_1 = \phi_1 + \phi_2$ and $\xi_2 = \phi_1 - \phi_2$ (or any convenient multiple of these). (c) The first mode
has $\phi_1 = \phi_2 = Ae^{-\beta t}\cos(\omega_1t - \delta)$, and the second $\phi_1 = -\phi_2 = Ae^{-\beta t}\cos(\omega_2t - \delta)$, where

12.11 (b) The least-squares line has slope $-1.54$, giving $\delta = e^{1.54} = 4.66$, compared with the
correct value 4.67.

[Figure: plot of $\ln(\gamma_{n+1} - \gamma_n)$.]



12.13 $k \approx 0.64$.

12.15 



The picture confirms that $\Delta\phi(t)$ decreases exponentially.



12.19 (a) $x = A\cos(\omega t - \delta)$, where $\omega = \sqrt{k/m}$, and $A$ and $\delta$ are arbitrary constants.

(b) $E = \tfrac{1}{2}kx^2 + \tfrac{1}{2}m\dot{x}^2 = $ const.

12.21 (a)

x_0    x_1     x_2    x_3    x_4    x_5     ...   x_28     x_29     x_30
0      1.00    0.54   0.86   0.65   0.793   ...   0.7391   0.7391   0.7391
3      -0.99   0.55   0.85   0.66   0.791   ...   0.7391   0.7391   0.7391
100    0.86    0.65   0.80   0.70   0.765   ...   0.7391   0.7391   0.7391

(b) There is a single, stable attractor at $x^* = 0.739085$.
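
The table entries are consistent with repeated application of the map $x_{n+1} = \cos x_n$ (the first row runs $0 \to 1.00 \to 0.54 \to \dots$). Taking that map as an assumption, a minimal sketch of the iteration is given below; the function name `iterate_cos` is illustrative.

```python
import math

def iterate_cos(x0, steps):
    """Apply x -> cos(x) the given number of times, starting from x0."""
    x = x0
    for _ in range(steps):
        x = math.cos(x)
    return x

# The three starting values used in the table for 12.21(a).
finals = {x0: iterate_cos(x0, 30) for x0 in (0.0, 3.0, 100.0)}
for x0, x30 in finals.items():
    print(x0, round(x30, 4))

# Many more iterations pin down the attractor itself.
x_star = iterate_cos(0.0, 200)
print(round(x_star, 6))
```

Every starting value lands on the same number, which is the sense in which the fixed point of $\cos x = x$ is a single stable attractor.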




12.23 (b) $r_0 = 1/\pi = 0.318$. (c) $r_1 = 0.720$.


12.25 [Figure: three panels, for $r = 0.60$, $r = 0.79$, and $r = 0.85$, each with horizontal axis from 0 to 20.]




12.27 (c) $x_a = 0.5130$ and $x_b = 0.7995$.

12.29 [Figure: plot of $\ln(r_{n+1} - r_n)$.]

From the slope of the least-squares fit, we get $\delta = 4.71$.

[Figure: panels (a) and (b), each with horizontal axis $t$ from 10 to 40; panel (b) shows $\log|x' - x|$.]

12.33 See Figure 12.41.


Chapter 13 


13.1 $\mathcal{H} = p^2/2m$. The Hamilton equations are $\dot{x} = p/m$ and $\dot{p} = 0$, with solutions $p = p_0 = $
const, and $x = x_0 + v_0t$, where $v_0 = p_0/m$.

13.3 The Hamiltonian is $\mathcal{H} = p^2/2(m_1 + m_2 + \tfrac{1}{2}M) - (m_1 - m_2)gx$. The Hamilton equations
are $\dot{x} = p/(m_1 + m_2 + \tfrac{1}{2}M)$ and $\dot{p} = (m_1 - m_2)g$, and the acceleration is $\ddot{x} = g(m_1 - m_2)/(m_1 + m_2 + \tfrac{1}{2}M)$.

13.5 The Hamiltonian and the two Hamilton equations are

$\mathcal{H} = \dfrac{p^2}{2m(c^2 + R^2)} + mgc\phi$, $\dot\phi = \dfrac{p}{m(c^2 + R^2)}$, and $\dot{p} = -mgc$.

Combining the last two gives $\ddot{z} = c\ddot\phi = -gc^2/(c^2 + R^2)$.


13.7 (a) $\mathcal{H} = \dfrac{p^2}{2m[1 + h'(x)^2]} + mgh(x)$. (b) The Hamilton equations are

$\dot{x} = \dfrac{p}{m[1 + h'(x)^2]}$, $\dot{p} = \dfrac{p^2h'(x)h''(x)}{m[1 + h'(x)^2]^2} - mgh'(x)$.


13.9 The Hamiltonian is $\mathcal{H} = (p_x^2 + p_y^2)/2m + mgy$. The Hamilton equations are $\dot{x} = p_x/m$,
$\dot{p}_x = 0$, $\dot{y} = p_y/m$, and $\dot{p}_y = -mg$.

13.11 The Hamiltonian is $\mathcal{H} = \dfrac{p_x^2 + p_y^2 + p_z^2}{2m} - p_xV + mgz$ (with $x$ measured along the tracks
and $z$ vertically up).

13.13 $\mathcal{H} = \dfrac{1}{2m}\left(p_z^2 + \dfrac{p_\phi^2}{R^2}\right) + \tfrac{1}{2}k(R^2 + z^2)$, $z = A\cos(\omega t - \delta)$, and $\dot\phi = $ const.

13.17 (a) $z_0 = [p_\phi^2/(m^2c^2g)]^{1/3}$. (d) $\alpha = \arcsin(1/\sqrt{3}) = 35.3°$.

13.19 $\mathcal{H} = \dfrac{1}{2m}(p_x^2 + p_y^2) + U(\sqrt{x^2 + y^2})$.


1 


13.21 (a) $\mathcal{H} = \dfrac{1}{2M}(P_X^2 + P_Y^2) + \dfrac{1}{2\mu}\left(p_r^2 + \dfrac{p_\phi^2}{r^2}\right) + \tfrac{1}{2}k(r - l_0)^2$. $X$, $Y$, and $\phi$ are ignorable; $r$
is not. (c) $r = l_0 + A\cos(\omega t - \delta)$, where $\omega = \sqrt{k/\mu}$ and $A$ and $\delta$ are constants.
13.23 (b) The two conjugate momenta are p x = m(x + y) and p y = m(x + Ay). “K = —

2m 

\\,{p x — Py^ 2 + Px] + ( c ) x = x o coscot and y = y 0 + |x 0 (l — cos cot), where 

co = ■sJAkj'im. 

13.25 (c) $P = \mathcal{H}$. (d) $Q = t + $ const.

13.31 (a) $3k$, (b) 0, (c) $k$, (d) 0.

13.33 (a) LHS $= 4\pi kR^3 = $ RHS. (b) Same.


Chapter 14 


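14.1 $\sigma = 0.79$ cm$^2$, $n_{\text{tar}} = 0.034$ cm$^{-2}$, probability = 0.027.
14.3 (Number density) $= 2.1 \times 10^{28}$ atoms/m$^2$.

14.5 $N_{\text{sc}} = 2.7 \times 10^7$.

14.7 $\Delta\Omega_{\text{moon}} = 6.45 \times 10^{-5}$ sr, and $\Delta\Omega_{\text{sun}} = 6.76 \times 10^{-5}$ sr.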

14.11 29. 



(b) $\sqrt{p^2 + 2mU_0}$. (c) See left picture.
da b , dO _

— =- where — =2 

dQ (sin 6)d6/db db 

for 0 < 9 < # max = 7 r — arcsin £; for 0 > # max , da/dQ 
14.17 $N_{\text{sc}}/Z^2 = 0.21, 0.20$, etc.

14.31 (b) If $\theta_{\text{lab}} = 0$, then $\theta_{\text{cm}}$ could be 0 or $\pi$.


( . 1 ? 

\*JR 2 — b 2 y/R 2 - C 2 h 2 

= 0. (e) $\sigma_{\text{tot}} = \pi R^2$.


A 


^lab(deg) 




(d) $\theta_{\text{lab}}(\max) = \arcsin(1/\lambda)$.


Chapter 15

15.3 $\gamma = 1 + 3.56 \times 10^{-10}$; difference $= (\Delta t_0 - \Delta t) = -1.28\ \mu$s; percent difference $= -3.56 \times 10^{-8}\,\%$.

15.5 A’s clocks read, and A has aged by, 25 years, compared with the 80 years by which B has 
aged. 

15.7 Number expected, N = 420, taking account of time dilation; but N = 29, ignoring time 
dilation. 

15.11 u = \c. 

15.13 (a) / = 91.7 cm; (b) l = 83.2 cm. 

15.17 (a) If we call the two events 1 and 2, then, as observed in $S'$, $t_1' = 0$ but $t_2' = -\gamma\beta a/c$. (b) As
observed in $S''$, $t_1'' = 0$ but $t_2'' = +\gamma\beta a/c$.

15.19 (a) $x_F = d$, $t_F = d/c$, $x_B = -d$, $t_B = d/c$.

(b) $x_F = \gamma(1 + \beta)d$, $t_F = \gamma(1 + \beta)d/c$, $x_B = -\gamma(1 - \beta)d$, $t_B = \gamma(1 - \beta)d/c$.

15.21 v = 0.91c. 

15.23 The velocity of the right rocket relative to the left one is 0.994c to the left. 

15.25 (Speed as measured in S) = c. 

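15.29 (a)

$\mathbf{R}(\theta) = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

15.33 (a)

$x_1' = x_1$, $x_2' = \gamma(x_2 - \beta x_4)$, $x_3' = x_3$, $x_4' = \gamma(x_4 - \beta x_2)$, whence

$\Lambda_{B2} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \gamma & 0 & -\gamma\beta \\ 0 & 0 & 1 & 0 \\ 0 & -\gamma\beta & 0 & \gamma \end{bmatrix}$.

(b)

$\Lambda_{R+} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$ and $\Lambda_{R-} = \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$.
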


15.47 v = 0.20c. 

15.55 Since u is a four-vector 

$u_1' = \gamma(V)[u_1 - \beta(V)u_4]$, $u_2' = u_2$, $u_3' = u_3$, $u_4' = \gamma(V)[u_4 - \beta(V)u_1]$.

15.57 $T(\text{Bi}) + T(\text{He}) = 1.3 \times 10^{-12}$ J = 8.1 MeV.

15.61 $E = 5$ MeV; $v = 0.8c$.

15.63 1 MeV/$c^2$ $= 1.78 \times 10^{-30}$ kg; 1 MeV/$c$ $= 5.34 \times 10^{-22}$ kg·m/s.
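
The two conversion factors in 15.63 follow directly from the SI values of the electron volt and the speed of light. A minimal sketch, with illustrative variable names:

```python
EV_IN_JOULES = 1.602176634e-19   # exact SI value of one electron volt
C = 299_792_458.0                # exact SI speed of light, m/s

MEV = 1.0e6 * EV_IN_JOULES       # one MeV in joules

mev_per_c2_in_kg = MEV / C**2    # mass unit: 1 MeV/c^2 in kilograms
mev_per_c_in_si = MEV / C        # momentum unit: 1 MeV/c in kg m/s

print(f"{mev_per_c2_in_kg:.3e} kg")
print(f"{mev_per_c_in_si:.3e} kg m/s")
```

Dividing an energy by $c^2$ gives a mass and dividing by $c$ gives a momentum, which is why these units are convenient in relativistic problems.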

15.65 (b) v max = 0.12c. 

15.67 u f = 0;M = fm. 

15.71 (c) The minimum energies are $E_{\text{cm}} \approx 3100$ MeV, but $E_{\text{lab}} \approx 9.6 \times 10^6$ MeV.

15.75 M = 2.95 GeV/c 2 ; v = 0.65c. 

15.81 r(rel) = 8.3 cm; r(nonrel) = 5.9 cm. 

15.93 If $E$ denotes the final energy of the electron, $E/E_0 \approx 0.002$.

15.109 (a) If we choose axes so that the two particles lie in the $xy$ plane, then in their rest frame
$S'$, the force of one on the other is $\mathbf{F}' = (0, kq^2/r^2, 0)$. According to (15.155) this transforms to
$\mathbf{F} = (0, kq^2/\gamma r^2, 0)$ in $S$. (b) In $S'$ the fields of one charge at the position of the other are $\mathbf{E}' = (0, kq/r^2, 0)$
and $\mathbf{B}' = (0, 0, 0)$. In $S$, they are $\mathbf{E} = (0, \gamma kq/r^2, 0)$ and $\mathbf{B} = (0, 0, \gamma\beta kq/cr^2)$;
these produce a force $\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$ which is easily seen to be the same as in part (a).





Chapter 16 





Index 


All entries are identified by their page number. In addition, when a reference is to a whole section or chapter, 
I have indicated the section or chapter in parenthesis; for example, (Sec. 1.1) or (Ch. 1). Similarly, when a reference 
is primarily to a figure, example, problem, or footnote, I have added a parenthesis, such as (Fig. 1.2), (Ex. 1.3), 
(Pr.1.4), or just (Ft.). 


Absolute future, 626 
Absolute motion, non-existence 
of, 602 

Absolute past, 626 
Accelerating reference frame, 327 
(Sec.9.1) 

Acceleration, 
centripetal, 29 
Coriolis, 29 

Coriolis, and Coriolis force, 358 
(Sec.9.10) 
free-fall, 345-347 
in 2D polar coordinates, 29 
in Cartesian coordinates, 23 
Action integral, 239 
Addition, 

of angular velocities, 338 
of vectors, 6 
Air resistance, 43-65 
comparison of linear and 
quadratic, 45 

linear, 44, 46 (Secs.2.2-2.4) 
quadratic, 44, 57 (Secs.2.4-2.5), 
73 (Pr.2.4) 

Angular momentum, 90 (Secs.3.4- 
3.5) 

about CM, 97 
as orbital plus spin, 370 
conservation of, See 


Conservation of angular 
momentum 

in terms of CM and relative 
coordinates, 369 
in terms of Euler angles, 403 
$\mathbf{L} = \mathbf{I}\boldsymbol{\omega}$, 380

$L_z = I_z\omega$, for rotation about z
axis, 373

not necessarily parallel to $\boldsymbol{\omega}$,
374

of several particles, 93-95
of single particle, $\boldsymbol{\ell}$, 90
of two bodies in CM frame, 299 
total, L, 93-95 

Angular velocity, of earth’s spin, 
339 

Angular velocity vector, 336 
(Sec.9.3) 
addition of, 338 
Anisotropic oscillator, 171-172 
Aphelion, 309 
Apogee, 316 

Approach to chaos, for DDP, 467 
(Sec. 12.4) 

Attractor, 186 
DDP has several attractors, 
470-471 

for logistic map, 503 
strange, 497 


Atwood machine, 131-133 
by Lagrange’s equations, 255 
(Ex.7.3) 

double, 285 (Pr.7.27) 
energy of, 155 (Pr.4.31) 
including pulley, 156 (Pr.4.35) 
using Hamilton, 527 (Ex. 13.2) 
using Lagrange multiplier, 279 
(Ex.7.8) 

Autonomous equation, 460 (Ft.), 
537 (Ft.) 

Auxiliary equation, 175 

Axial symmetry, 378 

Backward light cone, 626 

Balance, inertial, 10 

Barn, 563

Basin of attraction, 518 (Pr. 12.22) 

Bead, 

on rotating rod, 284 (Pr.7.21) 
on spinning hoop, 260 (Exs.7.6 
& 7.7) 

on spinning hoop, oscillations 
of, 264 (Ex.7.7) 
on wire, using Hamilton, 526 
(Ex.13.1) 

Bernoulli’s theorem, 725- 
726 

$\beta$, damping constant, 174
Bifurcation, 264
of DDP, 474 
Bifurcation diagram, 
for DDP, 483-486 (Figs. 12.17- 
12.18) 

for logistic map, 510-513 
Block, 

on incline, 24 (Ex. 1.1), 115 
(Ex.4.3) 

sliding on wedge, 258 (Ex.7.5) 
BM = bulk modulus, 703 
for air, 737 (Pr. 16.36) 
in terms of Hooke’s law constant 
cr, 717 

Body cone, for free precession, 

399 

Body frame, 395 
Boost, 621 

Bottle in a bucket, 168 (Ex.5.2) 
Boundary conditions, 689 
Bounded orbits, 304, 309-312 
Brachistochrone, 222 (Ex.6.2), 

234 (Pr.6.21) 

is a cycloid, 224, 233 (Pr.6.14) 
isochronous property of, 234 
(Pr.6.25) 

Bulk modulus, See BM 

Calculus of variations, 215 
(Ch.6) 

definition of, 218 
with several variables, 226 
(Sec.6.4) 

Canonical momentum, 522 
Canonical transformation, 538 
examples of, 554 (Prs. 13.24- 
13.25) 

Carrying capacity, 501 
Cauchy’s reciprocal theorem, 735 
(Pr.16.19) 

Causality, 628 
Center of mass, See CM 
Central force, 18, 91, 133 (Sec.4.8)
conservative implies spherical 
symmetry, 137, 158 (Pr.4.45) 
spherical symmetry implies 
conservative, 158 (Prs.4.43 & 
4.44) 

Central-force, two-body motion, 
See Two-body, central-force 
motion 

Centrifugal force, 262, 343, 344 
(Sec.9.6) 

contribution to g, 345-347 
Centripetal acceleration, 29 
Chandler wobble, 400 


Chaos, 457 (Ch.12) 
and sensitivity to initial 
conditions, for DDP, 479-483 
and sensitivity to initial 
conditions, for logistic map 
519 (Pr.12.29) 
criterion for, 460 (Ft.) 
for DDP, 476 

for logistic map, 511-512 
(Fig. 12.43) 

Characteristic equation, 389 
Characteristic time, $\tau$,
for linear drag, 48, 52 
for quadratic drag, 59 
Charge in magnetic field, 65 
(Secs.2.5-2.7) 
helical motion of, 70-71 
Classical, 

definition of force, 11 
definition of mass, 10 
definition of momentum, 14 
mechanics, 3 
Cloud chamber, 560 
CM, center of mass, 87, 295, 367 
(Sec.10.1) 

acceleration related to external 
force, 88 

defined as integral, 88 
frame, See CM frame 
of earth & moon, 101 (Pr.3.17) 
of earth & sun, 101 (Pr.3.16) 
of solid cone, 89 (Ex.3.2) 
velocity related to total 
momentum, 88 
CM frame, 296 

for two-body motion, 297-298 
relativistic, 645 
Colatitude, 9, 134 
Collision, 

elastic, 99 (Pr.3.5), 142 
elastic, equal mass, 143 (Ex.4.8) 
elastic, relativistic, 646 
(Ex.15.10) 

of putty lump with turntable, 96 
(Ex.3.3)

of relativistic putty, 644 
(Ex. 15.9) 

perfectly inelastic, 84 (Ex.3.1), 
159 (Pr.4.48) 

unequal masses, elastic, 159 
(Pr.4.46) 

Collision theory, 557 (Ch.14) 
quantum, 557 

See also Scattering, Cross
section 

Complementary function, 181 (Ft.) 


Complete elliptic integral, 157
(Pr.4.38)

Complete solution of 1D
motion, using energy, 127
Complex exponentials, 68-69 
Complex numbers, 79 (Prs.2.45- 
2.51) 

used for charge in magnetic field, 
67-71 
Compton, 
effect, 655-656 
generator, 365 (Pr.9.31) 

Cone, 

CM of, 89 (Ex.3.2) 
inertia tensor of, 384 (Ex. 10.3) 
Configuration space, 522 
Conjugate momentum, 522 
Conservation, 

of angular momentum, 91, 96, 
299 

of energy, 114 
of energy for multiparticle 
system, 146 

of energy for two particles, 
140-142 

of energy in Lagrangian 
mechanics, 269-272 
of momentum, 18, 21, 83 
Conservation laws, in Lagrangian 
mechanics, 268 (Sec.7.8) 
Conservative force, 109-111 
central implies spherical 
symmetry, 137, 158 (Pr.4.45) 
Coulomb force as example, 119 
(Ex.4.5) 
defined, 111 

second condition for, 118-119 
Constitutive equation, 715 
Constrained systems, 245, 249 
Constraint equation, 275 
Constraint force, 251 
eliminated in Lagrangian 
approach, 237 

related to Lagrange multiplier, 
278 

Continuity equation, 726 
Continuum hypothesis, 682 
Continuum mechanics, 681 
(Ch.16) 

Core (outer) of earth is liquid, 722 
Coriolis acceleration, 29 
and Coriolis force, 358 
(Sec.9.10) 

Coriolis force, 343, 348 (Sec.9.7) 
and Coriolis acceleration, 358 
(Sec.9.10)
compared to magnetic force, 348
effect on free fall, 351 (Sec.9.8) 
Coulomb force is conservative, 

119 (Ex.4.5) 

Coupled differential equations, 47 
Coupled oscillators, 417 (Ch.11)
damped, 449 (Pr. 11.10) 
damped and driven, 449 
(Pr. 11.11) 

matrix equation of motion for, 
418-419, 439

weak coupling, 426 (Sec. 11.3) 
with n degrees of freedom, 436 
(Sec. 11.5) 

Coupled pendulums, 441 
(Sec. 11.6) 

Covariant & contravariant vectors, 
656 

Critical damping, 177-178 
as limit of weak damping, 
210-211 (Prs.5.24 & 5.32)
Cross product, 7 
Cross section, 560 (Secs. 14.2- 
14.3) 

capture, 566 
definition of, 562 
differential, See Differential 
cross section 
elastic & inelastic, 567 
fission, 567 
in various frames, 579 
ionization, 567 
ionization, is zero below 
ionization energy, 568 
scattering, 566 
total, 568 

Crows in oak tree, 562 (Ex. 14.1) 
Cube, balanced on cylinder, 130 
(Ex.4.7) 

PE near equilibrium, 162 
(Ex.5.1) 

Curl, of vector, 119, 152-153 
(Prs.4.22 & 4.25) 

Curvilinear ID systems, 129 
(Sec.4.7) 

Cyclic coordinates, 266 
See also Ignorable coordinates 
Cycloid, 224, 233 (Pr.6.14) 

See also Brachistochrone 
Cyclone, 350 

Cyclotron frequency, 66, 71 
Cylinder on incline, 147 (Ex.4.9) 
Cylindrical polar coordinates, 40 
(Pr. 1.47) 

Damped oscillations, 173 (Sec.5.4) 


Damping constant $\beta$, 174
of DDP, 463 
DDP, 462 (Sec. 12.2) 
approach to chaos, 467 
(Sec. 12.4) 

expected properties of, 463 
(Sec. 12.3) 

rolling motion of, 481- 
482 (Fig.12.15), 486-487 
(Fig. 12.19) 

Decay parameter, 176, 177, 178 
Decomposition of strain tensor, 
714 

Degrees of freedom, 249 
$\nabla$, del, 117

as differential operator, 118
$\nabla^2$ = Laplacian, 694-695
Derivatives matrix, D, 710 
for a small rotation, 711 
Deviatoric part E' of strain tensor, 
714 

Diagonal matrix, 386 
Diagonalization of matrices, 739 
(Appendix) 

of a single matrix, 739 
(Sec.A.l) 

of inertia tensor, 392 
of two matrices, 743 (Sec.A.2) 
Diatomic molecule, PE of, 
126-127 

Differential cross section, 568 
(Secs. 14.4-14.5) 
calculation of, 572 (Sec. 14.5) 
definition of, 570-571 
for hard sphere scattering, 573 
(Ex. 14.5) 

for Rutherford scattering, 576 
in lab, related to CM, 582, 585 
in various frames, 579 
Differential equations, 14 
coupled, 47 
general solution of, 32 
Differential operator, 180 
Differentiation, 
of vectors, 7 

partial, 116-117, 152 (Prs.4.10 
& 4.11) 

Dilatation, 712 (Ex. 16.5) 

Discrete time, 499 
Divergence, $\nabla \cdot \mathbf{v}$, 546
as outward flow/volume, 548 
in n dimensions, 548 
Divergence theorem, 546 
proof of, 556 (Pr. 13.37) 
Doppler effect, 631-632 
transverse, 672 (Pr. 15.48) 


Dot product, 6 

equivalence of two definitions 
of, 35 (Pr. 1.7) 

Double map, $f(f(x))$, 507
Double pendulum, 430 (Sec. 11.4)
Drag, See Air resistance
Drive frequency, $\omega$,
of DDP, 462

vs. natural frequency $\omega_0$, 182
Drive strength, $\gamma$, of DDP, 463
Driven damped oscillations 
(linear), 179 (Secs.5.5-5.6) 
complex form of, 182 
Fourier series for, 197 (Sec.5.8) 
Driven damped pendulum, See 
DDP 

Dumbbell, sliding & spinning, 97 
(Ex. 3.4) 

Dynamical balancing, of car 
wheels, 375 

$\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$, unit vectors, 5
Eccentricity, e, of Kepler orbits, 
311 

related to energy, 313 
Effective PE, for central force, 301 
Eigenvalue, 389 
of fixed point, 505 
Eigenvalue equation, 389 
generalized, for coupled 
oscillators, 420 
Eigenvector, 389 
Elastic collision, 99 (Pr.3.5), 142 
energy lost in lab frame, 592 
(Pr. 14.29) 

equal mass, 143 (Ex.4.8) 
relativistic, 646 (Ex. 15.10) 
unequal masses, 159 (Pr.4.46) 
Elastic modulus = stress/strain, 
703-704 

Electric and magnetic forces, 
relative strengths of, 38 
(Pr. 1.32) 

Electric field, of charge with 
constant velocity, 680 
(Pr.15.110) 

Electrodynamics, relativistic, 660 
(Sec. 15.18) 

Electromagnetic, 
four-vector current density, 680 
(Pr.15.108) 

four-vector potential, 680 
(Pr.15.107) 
momentum, 22 
Electromagnetic field, 

Lorentz transformation of, 661
of moving line charge, 662 
(Ex. 15.12) 
tensor, 661 

Elliptic integral, 157 (Pr.4.38) 
Elliptical orbits, of planets, 309 
Elsewhere, 628 
Energy, 105 (Ch.4) 
conservation of, See 
Conservation of energy 
kinetic, See Kinetic energy 
mass, 641 
mechanical, 113 
of 1D linear systems, 123
(Sec.4.6) 

of comet, related to eccentricity, 
313 

of multiparticle system, 144 
(Sec.4.10) 
of SHM, 169
of two particles, 138-142 
potential, See Potential energy 
relativistic, 638 
rest, 641 
threshold, 647 
Equation of continuity, 726 
Equation of motion, 23 
for elastic solid, 721 
for inviscid fluid, 725 
See also Newton’s second law 
Equilibrium of 1D system, when
$dU/dx = 0$, 125
Ether drag, 600 
Euler angles, 401 (Sec. 10.9) 
Eulerian description of fluid, 723 
(Ft.) 

Euler-Lagrange equation, 220 
with two dependent variables, 
227 

Euler’s equations, 394 (Secs. 10.7-10.8)

with zero torque, 397 (Sec. 10.8)
Euler’s formula, 68-69 
Even function, 198 

Feigenbaum relation, for DDP, 
474-475 

for logistic map, 519 (Pr. 12.29) 
Fermat’s principle, 217 
and law of reflection, 231 
(Pr.6.3) 

and Snell’s law, 231 (Pr.6.4) 
Fictitious force, 329 
First integral, of Euler-Lagrange
equation, 232 (Prs.6.10 &
6.20)


Fixed point, 503 
eigenvalue of, 505 
multiplier of, 505 
stable, 505 
Force, 

as gradient of PE, 116-117 
central, See Central force 
centrifugal, See Centrifugal force 
conservative, See Conservative 
force 

constraint, See Constraint force 
Coriolis, See Coriolis force 
definition of, 11 
derivable from PE, 117 
fictitious, 329 
heat-like, 649 
inertial, 328 
nonconservative, 114 
on particle $\alpha$, $\mathbf{F}_\alpha = -\nabla_\alpha U$, 146
relativistic, 649 (Sec.15.15) 
spherically symmetric, 134 
surface, 698-699 
tidal, 332 
volume, 698 

Force constant, $\gamma$, for Kepler
problem, 308 

Forced coordinates, 249 (Ft.) 
Forward light cone, 625 
Foucault pendulum, 354 (Sec.9.9) 
Four-force, 652 

Fourier coefficients, integrals for, 
195 

Fourier series, 192 (Secs.5.7-5.9) 
definition of, 194 
for driven oscillator, 197 
(Sec.5.8) 

for rectangular pulse, 195 
(Ex.5.4) 

Fourier sine series, 692, 734
(Pr. 16.13)

Fourier’s theorem, 194 
Four-momentum, 636 
Four-scalar, 622 
Four-space, 620 
Four-vector, 620 
current density, 680 (Pr.15.108) 
definition of, 622 
electromagnetic potential, 680 
(Pr.15.107) 

Four-velocity, 634 
Fractal, 496 

Frame of reference, See Reference 
frame 
Free fall, 

acceleration, 345-347 

and Coriolis force, 351 (Sec.9.8) 


using energy, 128 (Ex.4.6) 

Free precession, of axially 
symmetric body, 398-400 
Frequency, 
drive, of DDP, 462 
driving ω vs. natural ω₀, 182 
natural, ω₀, of DDP, 463 
Frisbee, precession of, 414 
(Pr. 10.43) 

Full width at half maximum, 189 
Fundamental mode, 690 
FWHM, 189 

g, contribution of centrifugal force 
to, 345-347 

Galilean invariance, of Newton’s 
laws, 598 

Galilean relativity, 596 (Sec.15.2) 
Galilean transformation, 597 
γ = drive strength, of DDP, 463 
γ = force constant, for Kepler 
problem, 308 

γ = 1/√(1 − β²), 606 

Γ = torque, 90 

Gauss’s theorem, See Divergence 
theorem 

Gedanken experiment = thought 
experiment, 603 
Geiger & Marsden, 577 
data, 590 (Prs.14.16-14.17) 
General solution, of second-order 
differential equation, 32 
Generalized coordinates, 240, 
247-249 
forced, 249 (Ft.) 
natural, 249 
Generalized force, 241 
φ component = torque, 244 
Generalized Hooke’s law, 716 
Generalized momentum, 241, 266 
φ component = angular 
momentum, 244 
Geodesic, 

on cone, 233 (Pr.6.17) 
on cylinder, 232 (Pr.6.7) 
on sphere, 225, 233 (Pr.6.16) 
Gibbs phenomenon, 197 (Ft.) 
g₀ = “true” acceleration of gravity, 
346 

GPS, importance of time dilation 
for, 608 

Gradient, ∇, 117, 152 (Prs.4.12- 
4.15 & 4.18) 

in spherical polar coordinates, 
136-137 

Graviton, 653 (Ft.) 
Growth equation, 499-500 
Half-life, 607 

Half-pipe and skateboard, 30 
Half width at half maximum, 190 
Halley’s comet, 311 
Hamiltonian, ℋ, 270, 521 (Ch.13) 
as energy for natural system, 
270-272, 525 
definition of, 523, 529 
for charge in magnetic field, 553 
(Pr. 13.18) 

not equal to energy for 
non-natural systems, 552 
(Prs.13.11 & 13.12) 
Hamilton’s equations, 521 
(Ch.13) 

compared with Lagrange’s, 536 
(Sec. 13.5) 

derivation of, in 1D, 525-526 
for 1D oscillator, 539 (Ex. 13.5) 
for 1D systems, 524 (Sec.13.2) 
for central force, 531 (Ex. 13.3) 
for free fall, 541 (Ex.13.6) 
for mass on cone, 533 (Ex. 13.4) 
in several dimensions, 528 
(Sec. 13.3) 

See also Hamiltonian, ℋ 
Hamilton’s principle, 239 
Hard-sphere scattering, 573 
(Ex. 14.5) 

lab & CM cross sections for, 585 
(Ex. 14.7) 

Harmonic, 466 
of finite string, 691 
Heat-like force, 649 
Hobos on flatcar, 99 (Pr.3.4) 
Holonomic systems, 249 
Homogeneous equation, 180 
Homogeneous solution, 181 
Hooke’s law, 161 (Sec.5.1) 
generalized, for solid, 716 
Horizontal, definition of, 347 
Hurricane, 350 
HWHM, 190 

Hyperbolic functions, 61, 77 
(Prs.2.33-2.34) 

Hyperbolic Kepler orbits, 314-315 

Ideal fluid, 723 (Secs.16.12-16.13) 
Ignorable coordinates, 266 
in Hamiltonian mechanics, 535 
(Sec. 13.4) 

Impact parameter, b, 558-560 
Incline, 

and rolling cylinder, 147 (Ex.4.9) 


block on, 24 (Ex.1.1), 115 
(Ex.4.3) 

Independent functions, 174 (Ft.) 
Inelastic collision, 84 (Ex.3.1), 

159 (Pr.4.48) 

Inertia tensor, 378 (Sec. 10.3) 
definition of, 380 
diagonalization of, 392 
for lamina, 411 (Pr.10.23) 
for solid cone, 384 (Ex. 10.3) 
for solid cube, 381 (Ex. 10.2) 
symmetry of, 381 
Inertial balance, 10 
Inertial force, 328 
Inertial frame, 9, 15, 601 
Inhomogeneous equation, 180 
Integral, line, 107 
Invariance, 

of Newton’s laws under Galilean 
transformation, 598 
rotational, 134 
translational, 139 
Invariant, 
mass, 633 

length squared, x · x, 624 
scalar product, 623 (Sec.15.9) 
scalar product, definition in 4D, 
624 

Inverse Lorentz transformation, 
612 

Inviscid fluid, 723 (Secs.16.12- 
16.13) 

Irreducible subspaces, 716 (Ft.) 
Isotropic oscillator, 170-171 
Isotropy of shear-free pressure, 
699 

Iterated map, 501 

KE, See Kinetic energy 
Kepler orbits, 308 (Secs.8.6-8.7) 
are elliptical, 309 
changes of, 315 (Sec.8.8) 
eccentricity of, 311, 313 
hyperbolic, 314-315 
parabolic, 314 
See also Bounded orbits, 
Unbounded orbits 
Kepler problem, 301 
See also Two-body, central-force 
motion 

Kepler’s first law, 311 
Kepler’s second law, 91-93 
Kepler’s third law, 311-312 
Kinetic energy, 105 
of rotation about a fixed axis, 
374 


of rotation about any axis, 388, 
412 (Pr. 10.33) 
relativistic, 641 
rotational, in terms of Euler 
angles, 403 

total, as orbital plus spin, 371 
Kronecker delta symbol, δ_ij, 411 
(Pr.10.21) 

ℓ = angular momentum, 90 
L = total angular momentum, 
93-95 

Lagrange multipliers, 275 
(Sec.7.10) 

related to constraint forces, 278 
Lagrange’s equations, 237 (Ch.7) 
and conservation laws, 268 
(Sec.7.8) 

compared with Hamilton’s, 536 
(Sec. 13.5) 

for magnetic forces, 272 
(Sec.7.9) 

for unconstrained motion, 238 
(Sec.7.1) 
modified, 277 

with constraints, 250 (Sec.7.4) 
with Lagrange multipliers, 277 
See also Lagrangian 
Lagrangian description of fluid, 
723 (Ft.) 

Lagrangian, ℒ, 
for charge in magnetic field, 
273-274 

general definition of, 272 
ℒ = T − U, 238 
nonuniqueness of, 272-273 
of top, in terms of Euler angles, 
403 

See also Lagrange’s equations 
λ = mass ratio, 584 
λ = thrust factor, 317 
Lamé constants, 716 (Ft.) 

Lamina, 547 (Ft.) 
inertia tensor of, 411 (Pr.10.23) 
principal axis of, 412 (Pr. 10.30) 
Laminar flow, 547 
Laplacian, ∇², 694-695 
Larmor precession, 363 (Pr.9.22) 
Latitude, 134 
Law of inertia, 13 
See also Newton’s first law 
Law of reflection, in hard-sphere 
scattering, 589 (Pr.14.13) 
Legendre transformation, 524 (Ft.) 
Length contraction, 608 (Sec.15.5) 
cannot be seen, 668 (Pr.15.14) 
formula, 609 
Liapunov exponent, 480 
Lift, 43 

Light cone, 625 (Sec. 15.10) 
forward & backward, 625-626 
Line integral, 107 
Linear drag, 44, 46 (Secs.2.2-2.4) 
compared to quadratic, 45 
horizontal motion with, 48 
trajectory of projectile with, 54 
vertical motion with, 49 
Linear operator, 180 
Linearity & nonlinearity, 458 
(Sec. 12.1) 

Liouville’s theorem, 543 (Sec. 13.7) 
proof of, 548-549 
Lissajous figure, 172 
Logistic map, 498 (Sec. 12.9) 
bifurcation diagram for, 510-513 

chaos, 511-512 (Fig. 12.43) 
defined, 502 

Feigenbaum relation for, 519 
(Pr. 12.29) 

sensitivity to initial conditions, 
519 (Pr. 12.30) 

Longitude, φ, 134 
Longitudinal wave, 
in solid, 721 
in string, 735 (Pr.16.17) 
Lorentz-Fitzgerald contraction, 
610 

See also Length contraction 
Lorentz force, 660 
Lorentz scalar, 622 
Lorentz transformation, 610 
(Sec.15.6) 
equations, 612 
inverse, 612 

of electromagnetic fields, 661 
standard, 621 
LRC circuit, 173 
driven, 179 

Magnetic and electric forces, 
relative strengths of, 38 
(Pr.1.32) 

Magnetic field, charge in, 65 
(Secs.2.5-2.7) 

Magnetic force, 
and Lagrange’s equations, 272 
(Sec.7.9) 

between two current loops, 38 
(Pr.1.33) 

violates Newton’s third law, 22 


Map, 500 

double, f(f(x)), 507 
iterated, 501 

logistic, See Logistic map 
second iterate, f(f(x)), 507 
sine, 518 (Prs. 12.23-12.25) 
Mass, 

change of, in Franck-Hertz 
experiment, 640 (Ex.15.7) 
classical definition of, 10 
energy, 641 
invariant, 633 

is proportional to weight, 10 
matrix, 419, 440 
nonconservation of, in relativity, 
639-640 
ratio, λ, 584 

relativistic definition of, 633 
variable, 633, 637 
Massless particles, 652 (Sec. 15.16) 
Material derivative, 724 
Material description of fluid, 

723 
Matrix, 
diagonal, 386 
for standard boost, 621 
multiplication, 380 
orthogonal, 657 
positive definite, 743 
rotation (3D), 618 
rotation (4D), 621 
trace of, 714 
unit, 1, 384 

Maxwell’s equations, in four- 
vector form, 680 (Pr. 15.111) 
Mean free path, of air molecules, 
564 (Ex.14.3) 

Mechanical energy, 113 
Mechanics, 
classical, 3 

continuum, 681 (Ch.16) 
Hamiltonian, See Hamilton’s 
equations 

Lagrangian, See Lagrange’s 
equations 

nonlinear, 457 (Ch.12) 
quantum, 3 

Method of images, 733 (Pr.16.12) 
Metric matrix, G, 659 
MeV/c, 643 
MeV/c 2 , 643 

Michelson-Morley experiment, 
600 

Minkowski, 617 
Moment of inertia, 95 
4, 373-374 


Moment of inertia tensor, See 
Inertia tensor 
Momentum, 

angular, See Angular momentum 
canonical, 522 
classical definition of, 14 
conjugate, 522 
conservation of, See 

Conservation of momentum 
electromagnetic, 22 
generalized, See Generalized 
momentum 

of photon, related to wave vector, 
654 

relativistic, 636 

total, in terms of CM velocity, 88 
Momentum-energy, 638 
See also Four-momentum 
Morse potential energy, 207 
(Pr.5.2) 

μ = reduced mass, 296 
Multiplication, of matrices, 380 
Multiplier, 

Lagrange, 275 (Sec.7.10) 
of fixed point, 505 
Multistage rockets, 101 (Pr.3.12) 

n, unit vector normal to surface, 
698 

Natural coordinates, 249 
Natural frequency, ω₀, 174 
of DDP, 463 

vs. driving frequency, ω, 182 
Natural units, 442 
Navier equation, 721 
Neap tides, 335 
Neutrino, 653 (Ft.) 

Newton’s first law, 13 
validity of, 17 
Newton’s laws, 3 (Ch.1) 

Newton’s second law, 13 
in 2D polar coordinates, 29 
in Cartesian coordinates, 23 
(Sec.1.6) 

in rotating frame, 342 (Sec.9.5) 
in rotating frame, using 
Lagrange, 361 (Pr.9.11) 
rotational form of, 90 
validity of, 17 

Newton’s third law, 17 (Sec. 1.5) 
and conservation of momentum, 
18 

invalid in relativity, 21 
validity of, 21 

violated by magnetic forces, 22 





Noether’s theorem, 267, 272 
and angular momentum, 290 
(Pr.7.46) 

Nonconservative force, 114 
Nonholonomic systems, 249-250 
Noninertial frame, 15, 327 (Ch.9) 
Nonlinear mechanics, 457 (Ch.12) 
Nonlinearity, 458 (Sec.12.1) 
Normal coordinates, 425, 444 
(Sec. 11.7), 454 (Prs. 11.33-11.35) 

Normal frequencies, 421 
Normal modes, 417 (Ch.11) 
defined, 422 
of finite string, 689-691 
Numerical solution, for trajectory 
of baseball, 63 (Ex.2.6) 
Nutation, of top, 405 

ω₀ = natural frequency, 174, 463 
One-dimensional systems, 
graphs of PE for, 124-126 
energy of, 123 (Secs.4.6-4.7) 
Operator, linear, 180 
Orthogonal matrix, 657 
Oscillations, 161 (Ch.5) 
coupled, See Coupled oscillators 
damped, 173 (Sec.5.4) 
driven by rectangular pulse, 199 
(Ex.5.5) 

driven damped (linear), 179 
(Secs.5.5-5.6) 

driven damped, complex form 
of, 182 

in two dimensions, 170 (Sec.5.3) 
of bead on spinning hoop, 264 
(Ex.7.7) 

overdamped, 177 
underdamped, 176-177 
Overdamped oscillations, 177 
Overtones, 690 

Parabolic Kepler orbits, 314 
Parallel-axis theorem, generalized, 
411 (Pr. 10.24) 

Parseval’s theorem, 204 
Partial derivative, 116-117, 152 
(Prs.4.10 & 4.11) 

Partial fractions, 78 (Pr.2.37) 
Partial-wave series, 589 (Pr.14.12) 
Particle, 13 

Particular solution, 181 
PE, See Potential energy 
Pendulum, 

double, 430 (Sec. 11.4) 
driven damped, See DDP 


Foucault, 354 (Sec.9.9) 
in an accelerating car, 329 
(Ex.9.1) 

simple, See Simple pendulum 
spherical, 288 (Pr.7.40) 

Perigee, 316 
Perihelion, 309 
Period-doubling cascade, 
for logistic map, 510 
in convection of mercury, 
473-474 (Fig. 12.9) 
of DDP, 471-475 
Period three, of DDP, 469 
Period two, 
of DDP, 468 
of logistic map, 507 
Periodic function, definition of, 
193 

Phase-locked motion, 517 
(Pr.12.17) 

Phase point, z, 537 
Phase shift, 
near resonance, 191 
scattering, 588 (Pr.14.12) 

Phase space, 523 
Phase-space orbit, 538 (Sec. 13.6) 
for 1D oscillator, 539 (Ex.13.5) 
for free fall, 541 (Ex. 13.6) 
Phase-space vector, z, 537 
φ̂, unit vector, 
derivative of, 28 

in cylindrical polars, 40 (Prs. 1.47 
& 1.48) 

in spherical polars, 135-136 
in 2D, 26 

Photon, 652 (Sec.15.16) 
relation between p & k, 654 
Pion, decay of, 667 (Pr.15.8) 
Planck’s constant, 588 (Pr.14.12), 
654 

Plane wave, 695 
Pluto, discovery of, 595 (Ft.) 
Poincaré, 459 
section, 495 (Sec. 12.8) 

Point mass, 13 
Polar coordinates, 
cylindrical, 40 (Pr.1.47) 
in 2D, 26 

spherical, 134-135 
Positive definite matrix, 743 
Potential energy, 
defined, 111 

effective, for two-body motion, 
301 

in uniform gravitational field, 
151 (Prs.4.5 & 4.6) 


internal, of rigid body, 147 
kx⁴, 154 (Pr.4.29) 
of 1D linear systems, 124-126 
of charge in electric field, 112 
(Ex.4.2) 

of diatomic molecule, 126-127 
of many particles, 145, 146 
of simple pendulum, 155 
(Pr.4.34) 

of spring, 151 (Pr.4.9) 
of two charges, 121 
time-dependent, 121, 154 
(Pr.4.27) 

Precession, 
free, 398-400 

free, using Euler angles, 415 
(Pr. 10.55) 
of equinoxes, 394 
of frisbee, 414 (Pr. 10.43) 
of top due to weak torque, 392 
(Sec. 10.6) 

of top, using Euler angles, 404 
Preferred frame, non-existence of, 
602 

Pressure is isotropic if no shearing 
forces, 699 
Primary, P, wave, 722 
Principal axes of inertia, 387 
(Sec. 10.4) 
existence of, 388 
finding, 389 (Sec. 10.5) 
for cube about corner, 390 
(Ex. 10.4) 

of lamina, 412 (Pr. 10.30) 
Principal moments (of inertia), 387 
Principal stress axes, 735 
(Pr.16.21) 

Product, 

cross, of two vectors, 7 
dot, of two vectors, 6, 35 (Pr.1.7) 
of inertia, 375-376 
of vector and scalar, 6 
Proper length, 610 
Proper time, 607 
of an object, 634 

Q factor, 190 

Quadratic drag, 44, 57 (Secs.2.4- 
2.5), 73 (Pr.2.4) 
compared to linear, 45 
horizontal and vertical motion 
with, 62 

horizontal motion with, 58 
trajectory of baseball with, 63 
(Ex.2.6) 

vertical motion with, 60 
Quantum collision theory, 557 
Quantum mechanics, 3 
Quasiperiodic motion, 172 
Quotient rule, 630 


r̂, 
derivative of, 27 
unit vector of 2D polars, 26 
unit vector, of spherical polars, 
135-136 
Radial equation, 
for two-body motion, 299-300 
transformed, 306 
Range, 

of baseball with quadratic drag, 
63 (Ex.2.6) 

of projectile with linear drag, 54 
Rapidity, 670 (Prs.15.30-15.31) 
Rectangular pulse, 
driving an oscillator, 199 
(Ex.5.5) 

Fourier series for, 195 (Ex.5.4) 
Reduced mass, μ, 296 
Reference frame, 9 
accelerating, 327 (Sec.9.1) 
body, 395 

inertial, See Inertial frame 
noninertial, 15, 327 (Ch.9) 
rotating, See Rotating reference 
frame 
space, 395 

Reflection symmetry, 377 
Relative position, r, 294 
Relativistic electrodynamics, 660 
(Sec. 15.18) 

Relativistic snake, 613 (Ex. 15.3) 
Relativity, 596 
Galilean, 596 (Sec. 15.2) 
general, 596 

of simultaneity, 615, 669 
(Pr.15.19) 

of time, See Time dilation 
special, See Special relativity 
Resonance, 187 (Sec.5.6) 
on washboard road, 188, 212 
(Pr.5.43) 

phase shift near, 191 
width of, 189 
Rest energy, 641 
Rest frame, 610 

Reynolds number, 46, 72 (Pr.2.3) 
Rheonomous coordinates, 249 
(Ft.) 

ρ = distance from z axis, 40 
(Pr.1.47) 


ρ̂, unit vector, 40 (Prs.1.47 and 
1.48) 

Rigid body, 147 
rotation of, 367 (Ch.10) 

RLC circuit, 173 
driven, 179 

RMS, displacement, 203 
for driven oscillator, 204 (Ex.5.6) 
Rockets, 85 (Sec.3.2) 
multistage, 101 (Pr.3.12) 

Saturn V, 100 (Pr.3.6) 

space shuttle, 100 (Prs.3.7 & 3.9) 

thrust of, 86 

Rolling, of DDP, 481-482 
(Fig. 12.15), 486-487 
(Fig. 12.19) 

Root mean square, See RMS 
Rotating reference frame, 339 
(Secs.9.4-9.10) 

Newton’s second law in, 342 
(Sec.9.5) 

time derivatives in, 339 (Sec.9.4) 
Rotation, 367 (Ch.10) 
about a fixed axis, 372 (Sec. 10.2) 
about any axis, 378 (Sec. 10.3) 
Rotation matrix, 
four-dimensional, 621 
three-dimensional, 618 
Rotational scalar, 622 
Rotationally invariant force, 134 
Rutherford formula, 576 
Rutherford scattering, 557, 574 
(Sec. 14.6) 

angular dependence of, 577 
(Ex. 14.6) 

Saturn V rocket, 100 (Pr.3.6) 
Scalar, 

Lorentz, 622 
rotational, 622 
Scalar product, 6 
equivalence of two definitions 
of, 35 (Pr.1.7) 

Scattering, 

amplitude (quantum), 588 
(Pr.14.12) 

angle, in lab related to CM, 584 
angle, θ, 558 
elastic & inelastic, 567 
of neutrons off aluminum, 563 
(Ex. 14.2) 

of two hard spheres, 564 
Rutherford, See Rutherford 
scattering 

Schur’s lemma, 716 (Ft.) 


Scleronomous coordinates, 249 
(Ft.) 

Secondary, S, wave, 722 
Secular equation, 389 
Self similarity, 
of fractal, 497 

of logistic bifurcation diagram, 
512 

Sensitivity to initial conditions, 
for DDP, 480 (Fig. 12.13) 
for logistic map, 519 (Pr. 12.30) 
Separation of variables, 
for first-order differential 
equation, 58, 60 (Ft.), 73 
(Pr.2.7) 

for wave equation, 733 (Pr.16.9) 
Shear modulus, See SM 
Shear wave, in solid, 721-722 
Shearing flow, 547 (Ex. 13.7) 

SHM, See Simple harmonic 
motion 

Shortest path between two points, 
221 (Ex.6.1), 228 (Ex.6.3) 
in three dimensions, 235 
(Pr.6.27) 

Simple harmonic motion, 163 
(Sec.5.2) 

as real part of complex 
exponential, 167 
definition of, 165 
energy of, 169 
Simple pendulum, 
exact period of, 156 (Pr.4.38) 

PE of, 155 (Pr.4.34) 
second approximation for period, 
157 (Pr.4.39) 

Simultaneity, 615, 669 (Pr.15.19) 
Simultaneous diagonalization, of 
two matrices, 743 (Sec.A.2) 
Sine map, 518 (Prs. 12.23-12.25) 
Skateboard in half-pipe, 30 
SM = shear modulus, 703 
in terms of Hooke’s law constant 
111 

Snake, relativistic, 613 (Ex. 15.3) 
Snell’s law and Fermat’s principle, 
231 (Pr.6.4) 

Soap-bubble problem, 233 
(Pr.6.19) 

Solid angle, 569 
Space cone, for free precession, 
400 

Space frame, 395 

Space shuttle, 100 (Prs.3.7 & 3.9) 

Space-like vector, 628 





Space-time, 617 (Sec.15.8), 620 
Spatial description of fluid, 723 
Special relativity, 595 (Ch.15) 
postulates of, 601 
Speed, 

of longitudinal wave in solid, 

721 

of longitudinal wave in string, 
735 (Pr.16.17) 

of sound in air, 737 (Pr. 16.36) 
of sound in fluid, 729 
of sound in water, 729 (Ex. 16.9) 
of transverse wave in solid, 722 
of transverse waves on string, 
684 

Speed of light, 
and Michelson-Morley 
experiment, 600 
as speed limit for causal 
influences, 628 

as speed limit for inertial frames, 
606 

as speed limit for material 
particles, 629 

invariance of, 672 (Pr. 15.43) 
non-invariance under Galilean 
transformation, 599 
Spherical part e1 of strain tensor, 
714 

Spherical pendulum, 288 (Pr.7.40) 
Spherical polar coordinates, 
134-135 

gradient in, 136-137 
Spherical strain, 712 (Ex.16.5) 
Spherical wave, 696-697 
Spherically symmetric force, 134 
central implies conservative, 158 
(Prs.4.43-4.44) 

Spinning hoop with bead, 260 
(Exs.7.6 & 7.7) 

Spring tides, 335 
Spring-constant matrix, 419, 440 
sr, steradian, 570 
Stability, 

of freely spinning body, 398 
of fixed points, 505 
Stable equilibrium, when 
d²U/dx² > 0, 125 
Standard boost, 621 
Standard configuration, 598 
Standing wave, 688 
State (or state of motion), 490 
State space, 490, 522 

State-space orbit, 487 (Sec. 12.7) 
defined, 490 


Stationary path, 217-218 
Stationary point, 217 
Steradian, sr, 570 
Stokes’s law, 72 (Pr.2.2) 

Strain = fractional deformation, 
702-703 
Strain tensor, E, 
definition of, 711 
decomposition of, 714 
for a solid, 709 (Sec. 16.8) 
for dilatation, 712 (Ex. 16.5) 
for shear, 713 (Ex. 16.6) 
stretching elements, ε_ii, 713 
Strange attractor, 497 
Stress & strain tensors, relation 
between, 715 (Sec. 16.9) 
Stress = force/area, 702-703 
Stress tensor, Σ, 704 (Sec. 16.7) 
definition of, 706 
in a static fluid, 708 (Ex. 16.3) 
is symmetric, 707 
Stretching elements, ε_ii, of strain 
tensor, 713 
String, 

longitudinal motion of, 735 
(Pr.16.17) 

transverse motion of, 682 
(Sec.16.1) 

Strong damping, 177 
Subharmonic, 469 
Sum, of vectors, 6 
Superposition principle, 164 
not true for nonlinear equations, 
461 

Surface force, 698-699 
Sweet spot, 410 (Pr.10.18) 
Symmetry, 
axial, 378 

of inertia tensor, 381 
of stress tensor, 707 
under reflection, 377 

Target density, n_tar, 561 
τ = characteristic time, 
for linear drag, 48, 52 
for quadratic drag, 59 
Taylor’s series, 75 (Pr.2.18) 
Tensor, 656 (Sec.15.17) 
electromagnetic, 661 
four-dimensional, 659 
inertia, See Inertia tensor 
three-dimensional, 657-658 
Terminal speed, 
of baseball, 61 (Ex.2.5) 
with linear drag, 50 


with quadratic drag, 60 
θ̂, unit vector, of spherical polar 
coordinates, 135-136 
Three-force, 650 

Three-momentum, relativistic, 636 
Three-scalar, 622 
Threshold energy, 647 
Thrust, of rocket, 86 
Thrust factor, λ, 317 
Tidal force, 332 
Tides, 330 (Sec.9.2) 
neap, 335 
spring, 335 
Time, 

classical view of, 9 
discrete, 499 
horizon, 517 (Pr.12.16) 
in relativity, 603 (Sec. 15.4) 
proper, 607 

proper, of an object, 634 
Time-dependent PE, 121, 154 
(Pr.4.27) 

Time derivatives, in rotating 
frames, 339 (Sec.9.4) 

Time dilation, 603 (Sec. 15.4) 
cannot be seen, 668 (Pr.15.10) 
evidence for, 607-608 
for jet plane, 605 (Ex.15.1) 
formula, 606 
Time-like vector, 629 
Top, 

motion of, using Euler angles, 
403 (Sec.10.10) 
nutation of, 405 
precession of, due to weak 
torque, 392 (Sec. 10.6) 
precession of, using Euler 
angles, 404 
Torque, Γ, 90 
Trace of a matrix, 714 
Transformed radial equation, for 
two-body motion, 306 
Transients, 184 
Translational invariance, 139 
Transpose Ã of matrix A, 381, 658 
Transverse wave, 
in solid, 721-722 
on string, 682 (Secs.16.1-16.3) 
Triangle inequality, 36 (Pr.1.14) 
Turning point, 125 
for radial motion of comet, 303 
Twin paradox, 667 (Pr.15.5) 
Two-body, central-force motion, 
293 (Ch.8) 

closed and unclosed orbits, 304 
effective PE for, 301 
equation for relative motion, 297 
equivalent 1D problem, 300 
(Sec.8.4) 

Lagrangian for, 297 
radial equation for, 299-300 
relative motion lies in fixed 
plane, 299 

transformed radial equation, 306 
Two-cycle, for logistic map, 507 
Two-dimensional, 
oscillator, 170 (Sec.5.3) 
polar coordinates, 26 
Typhoon, 350 

Unbounded orbits, 303, 313 
(Sec.8.7) 

Underdamped oscillations, 
176-177 

Unit matrix, 1, 384 
Unit vectors, 
e₁, e₂, e₃, 5 

i, j, k, 4 
r̂ and φ̂, 26 

r̂ and φ̂, derivatives of, 27-28 
r̂, θ̂, φ̂, 135-136 
ρ̂, φ̂, ẑ, 40 (Prs. 1.47 and 1.48) 
x̂, ŷ, ẑ, 4 

Units, natural, 442 
Universality, of period doubling, 
475 

Unstable equilibrium, when 
d²U/dx² < 0, 125 

vdv/dx rule, 74 (Pr.2.12) 

Variable mass, 633, 637 
Variational principle, 215, 218 


Vector, 4 
cross product, 7 
curl of, 119 
differentiation of, 7 
dot product, 6 
product, 7 
scalar product, 6 
sum, 6 

times scalar, 6 

two definitions of dot product, 
35 (Pr.1.7) 

unit, See Unit vectors 
Velocity, 

four dimensional, 634 
in 2D polar coordinates, 28 
Velocity-addition formula, 
classical, 328, 598 
relativistic, 616 
Vertical, definition of, 347 
Virial theorem, 158 (Pr.4.41), 323 
(Pr.8.17) 

Viscosity, η, 72 (Ft.) 

Volume force, 698 
v_ter, terminal speed, 50, 60 

Washboard road, 188, 212 
(Pr.5.43) 

Wave, 

in fluid, 727 (Sec.16.13) 
in fluid is longitudinal, 729 
in rock, 722 (Ex. 16.8) 
in solid, 721 (Sec.16.11) 
longitudinal, in solid, 721 
on string, 682 (Secs. 16.1- 
16.3) 

plane, 695 
spherical, 696-697 
standing, 688 

transverse or shear, 721-722 


triangular, on finite string, 692 
(Ex. 16.2) 

triangular, on infinite string, 686 
(Ex. 16.1) 

Wave equation, 
for string, 684 
in terms of Laplacian, 695 
one-dimensional, 685 (Sec. 16.2) 
solution by separation of 
variables, 733 (Pr.16.9) 
three-dimensional, 694 
(Sec. 16.4) 

Weak damping, 175-177 
Weakly coupled oscillators, 426 
(Sec. 11.3) 

Weight is proportional to mass, 10 
Width of resonance, 189 
Work, 

as change in PE, 112 
done by force, 107 
in infinitesimal displacement, 
106 

Work-KE theorem, 108 
infinitesimal form, 106 
World line, 634, 671 (Pr.15.38) 

x̂, unit vector, 4 

ŷ, unit vector, 4 

YM = Young’s modulus, 703 
in terms of Hooke’s law constants 
α & β, 718 

related to BM & SM, 718 
Young’s modulus, See YM 
Yoyo, 283 (Pr.7.14) 

ẑ, unit vector, 4 
Zero-component theorem, 671 
(Pr. 15.35) 



